Foreword


Those of us involved in the creation of the Handbook of Artificial Intelligence, both
writers and editors, have attempted to make the concepts, methods, tools, and main results
of artificial intelligence research accessible to a broad scientific and engineering audience.
Currently, AI work is familiar mainly to its practicing specialists and other interested
computer scientists. Yet the field is of growing interdisciplinary interest and practical
importance. With this book we are trying to build bridges that are easily crossed by
engineers, scientists in other fields, and our own computer science colleagues.

In the Handbook we intend to cover the breadth and depth of AI, presenting general
overviews of the scientific issues, as well as detailed discussions of particular techniques
and important AI systems. Throughout we have tried to keep in mind the reader who is not a
specialist in AI.

As the cost of computation continues to fall, new areas of computer applications
become potentially viable. For many of these areas, there do not exist mathematical "cores"
to structure calculational use of the computer. Such areas will inevitably be served by
symbolic models and symbolic inference techniques. Yet those who understand symbolic
computation have been speaking largely to themselves for twenty years. We feel that it is
urgent for AI to "go public" in the manner intended by the Handbook.

Several other writers have recognized a need for more widespread knowledge of AI
and have attempted to help fill the vacuum. Lay reviews, in particular Margaret Boden's
Artificial Intelligence and Natural Man, have tried to explain what is important and
interesting about AI, and how research in AI progresses through our programs. In addition,
there are a few textbooks that attempt to present a more detailed view of selected areas
of AI, for the serious student of computer science. But no textbook can hope to describe all
of the sub-areas, to present brief explanations of the important ideas and techniques, and to
review the forty or fifty most important AI systems.

The Handbook contains several different types of articles. Key AI ideas and techniques
are described in core articles (e.g., basic concepts in heuristic search, semantic nets).
Important individual AI programs (e.g., SHRDLU) are described in separate articles that
indicate, among other things, the designer's goal, the techniques employed, and the reasons
why the program is important. Overview articles discuss the problems and approaches in
each major area. The overview articles should be particularly useful to those who seek a
summary of the underlying issues that motivate AI research.


Eventually the Handbook will contain approximately two hundred articles. We hope that
the appearance of this material will stimulate interaction and cooperation with other AI
research sites. We look forward to being advised of errors of omission and commission. For a
field as fast moving as AI, it is important that its practitioners alert us to important
developments, so that future editions will reflect this new material. We intend that the
Handbook of Artificial Intelligence be a living and changing reference work.

The Handbook represents the work of many graduate students at Stanford as well as
students and AI professionals at other institutions, including Rutgers University, SRI
International, Xerox Palo Alto Research Center, MIT, and the RAND Corporation. This report on
research toward applying AI techniques in medical systems was originally drafted at Rutgers
University by Victor Ciesielski and his colleagues there. James Bennet and Paul Cohen at
Stanford continued work on the material. Others who contributed to or commented on earlier
versions of this section include Saul Amarel, Donald Biesel, Bruce Buchanan, Randall Davis,
Casimir Kulikowski, Donald Smith, and William Swartout.


Avron  Barr  Stanford  University 

Edward  Feigenbaum  July,  1979 
A. Overview


There are two main areas where AI techniques are being applied in medical systems.
Although the application of pattern recognition and scene analysis techniques to the
interpretation of x-ray and ultrasonic images is an increasingly important diagnostic tool, this
report will focus on another area, the construction of consultation programs as an aid in
medical decision making.

The motivation for the development of expert computer-based medical consultation
systems is twofold. First, there are obvious benefits to society from providing reliable and
thorough diagnostic services, perhaps even at a reduced cost. It has been observed
(Ledley & Lusted, 1959) that most of the errors made by clinicians are errors of omission;
that is, in trying to identify the disease that a patient is suffering from, the physician does
not consider all of the possibilities, thereby missing the correct diagnosis. A computer
program could be designed to exhaustively consider all of the diseases in its domain.
Furthermore, there are some tasks that computers can perform more rapidly and accurately,
such as calculating doses of medicines, particularly in cases where dosage is critical and
many factors must be taken into account in the calculation (as in digitalis therapy; see
Article C5). There are also some tasks that physicians are notoriously poor at performing and
that are routine enough for the computer to do, such as the prescription of antimicrobial
therapy.

The second motivation for development of these systems stems from current interests
in computer science. Clinical medicine has been a very fertile area for the study of cognitive
processes ever since the diagnostic process began to be studied extensively (Jacquez, 1963).
There is a highly developed medical taxonomy; a large, relatively well-organized knowledge
base; and a number of human experts in the domain whose performance is significantly
better on hard problems than that of the average practitioner (i.e., there is an identifiable
expertise). Furthermore, the type of problem solving that occurs in the domain is repetitive.
These attributes reflect some of the prerequisites for applications of a developing sub-field
of AI known as knowledge engineering, which takes AI beyond the stage of "toy" problems to
confront large, real-world problems (Feigenbaum, 1977).

Computer-based consultation brings with it many formidable social, psychological, and
ethical problems that must be addressed by the system builders. These problems include
validating the systems, exporting them to hospitals and clinics, getting physicians and
patients to accept them, and deciding the responsibility for decisions made by these
systems.

In the following sections, aspects of the diagnostic process and medical decision
making will be discussed, as well as a number of AI issues related to the representation and
manipulation of medical knowledge.


Medical  Decision  Making 

There are three principal parts of medical decision making: data gathering, diagnosis,
and treatment recommendation. Data gathering is concerned with obtaining the patient
history and clinical and laboratory data. The clinical data consist of symptoms, which are the
subjective sensations reported by the patient (such as headache or chest pain), and
signs, which are objective and observable by the physician (Feinstein, 1967). Manifestation
refers to any sign, symptom, or finding. Laboratory results generally are referred to as
findings. Diagnosis is the process of using these data to determine the illness. The three
aspects are not independent; disease hypotheses are used to direct further information
gathering, while treatment recommendation depends on the diagnosis and generally requires
more information gathering. Often, the decision to do a test includes a physician's estimate
of the cost, both in terms of money and danger to the patient, which is weighed against the
value of the information gained. Gathering information, diagnosing the disease, and deciding
on a treatment regimen constitute a consultation. Figure 1 illustrates this process in relation
to the course of the disease.


[Figure 1. Consultation process depicting current state of medical knowledge.
Surviving labels: UNTREATED, TREATED; past, NOW, future.]


This characterization of a consultation highlights the current state of medical knowledge.
Etiology refers to the ultimate causes of the disease; pathogenesis refers to the way in
which the disease developed from its causes. A consultation proceeds by determining the
etiology. A treatment is then formulated for the identified diseases and their causes. Often,
however, the medical knowledge is incomplete and it is not possible to determine the causes
of a disease. In these cases, treatments must be based only on the knowledge of the
symptoms or characteristics of the diseases. Some diseases are very well understood and
knowledge about them is based on various kinds of models and specific mechanisms. Other
diseases are not very well understood and knowledge about them is only associational; for
example, treatment is prescribed on the basis of symptoms of closely associated diseases
for which treatments are known.


During  a  consultation  the  physician  performs  at  least  two  mental  processes:  reasoning 
and  judgment  (Ledley  &  Lusted,  1959).  Reasoning  involves  making  clinical  decisions  using 
various  formal  and  logical  techniques.  This  process  is  evident  primarily  in  the  diagnosis 
phase.  Judgment  has  come  to  mean  the  use  of  various  "intangibles"  such  as  general  feelings 
about  the  case  and  past  clinical  experience,  which  help  the  physician  to  make  clinical 
decisions.  These  are  evident  during  prognosis  and  therapy  recommendations.  Artificial 
intelligence  has  attempted  to  model  both  of  these  processes. 

There are, however, some aspects of consultations that computers cannot do, such as
the physical examination. The physician gains much firsthand information from general
appearance, facial expressions, etc., that is inaccessible to the computer. The design of
computer consultation systems must, therefore, take this factor into account and offer
mechanisms for the representation of these types of information secondhand.

History of Computers in Medicine

The use of computers in medical decision making began in the early 1960s with the
implementation of programs that performed well-known types of statistical analyses. These
programs focused on the diagnosis aspect of the consultation: They accepted a set of
findings and selected one disease from a fixed set, using methods such as pattern
recognition through discriminant functions, Bayesian decision theory, and decision tree
techniques (Croft, 1972; Nordyke, Kulikowski, & Kulikowski, 1971). Slightly more complex
programs performed "sequential diagnosis." Here, when there is not enough information to
make a reliable diagnosis, the next patient test (to get more information) is determined by a
strategy that selects the "best" test based on three factors: the cost of the test, the
danger to the patient, and the amount of discriminating information needed and made
available by the test.
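
The test-selection step of sequential diagnosis can be illustrated with a small sketch. It assumes, purely for illustration, that the three factors are expressed on comparable numeric scales and combined linearly; the `Test` class, the weights, and the example values are hypothetical and are not taken from any of the programs cited above.

```python
from dataclasses import dataclass


@dataclass
class Test:
    name: str
    cost: float         # monetary cost, arbitrary units (hypothetical scale)
    danger: float       # risk to the patient, arbitrary units (hypothetical scale)
    information: float  # expected discriminating information (hypothetical scale)


def best_test(tests, cost_weight=1.0, danger_weight=1.0):
    """Pick the test whose information outweighs its cost and danger the most."""
    def utility(t):
        return t.information - cost_weight * t.cost - danger_weight * t.danger
    return max(tests, key=utility)


# Made-up candidate tests; the numbers carry no clinical meaning.
candidates = [
    Test("test_a", cost=0.1, danger=0.0, information=0.3),
    Test("test_b", cost=0.5, danger=0.4, information=0.9),
    Test("test_c", cost=0.2, danger=0.1, information=0.6),
]
choice = best_test(candidates)
print(choice.name)
```

A linear trade-off is only one possible strategy; a real program would have to justify how money, danger, and information are put on a common scale.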

The appeal of using statistical methods is that the resulting decisions are "optimal"
according to specified criteria. Unfortunately, these statistical systems proved
unsatisfactory. The mathematics on which they were based assumed that the
patient has only one disease and that the data are not erroneous. More fundamentally,
certain assumptions and simplifications concerning the independence and mutual exclusivity
of various disease states that were made in order to make the statistical techniques
practical were found to be unjustified. Furthermore, many prior and conditional probabilities
required for complete analysis were simply not available.
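
To make these assumptions concrete, the following sketch applies Bayes' rule under exactly the simplifications criticized above: the patient has one and only one disease from a fixed set, and findings are conditionally independent given the disease. The function name, disease labels, and probabilities are invented for illustration and do not come from any of the cited programs.

```python
def posterior(priors, likelihoods, findings):
    """Posterior over a fixed, mutually exclusive set of diseases.

    priors:      {disease: P(disease)}
    likelihoods: {disease: {finding: P(finding | disease)}}
    findings:    observed findings, assumed independent given the disease
    """
    scores = {}
    for disease, p in priors.items():
        for finding in findings:
            p *= likelihoods[disease][finding]
        scores[disease] = p
    total = sum(scores.values())
    return {disease: s / total for disease, s in scores.items()}


# Made-up numbers with no clinical meaning.
priors = {"disease_a": 0.5, "disease_b": 0.5}
likelihoods = {
    "disease_a": {"finding_1": 0.8, "finding_2": 0.5},
    "disease_b": {"finding_1": 0.2, "finding_2": 0.5},
}
result = posterior(priors, likelihoods, ["finding_1", "finding_2"])
print(result)
```

Every prior and every conditional probability in the tables must be supplied, which is where such systems ran into trouble: many of these numbers were simply not available.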

Since the early 1970s there has been an increasing application of AI techniques to
medical decision making. Some of the formalisms, techniques, and languages
developed in AI were directly applicable to medicine before, but the new understanding of
the nature of the task called for new ways of representing knowledge and reasoning. For
example, the classical AI problem-solving techniques of state-space search and theorem
proving (see Search) were not directly applicable. Consider a simple application of
state-space search to the planning of a treatment. If one assumes that the "initial state" is the
"diseased patient," that the final state is the "healthy patient," and that the "operators" are
various drugs, physical therapies, surgical procedures, etc., it would appear that simple
search would find a path between the initial and final states. But there are two fundamental
problems. First, the initial state, the disease of the patient, is rarely known with certainty.
Second, the application of an operator (i.e., a treatment) is not guaranteed to result in an
expected state. In order to deal with these problems, methods for representing inexact
knowledge and for performing plausible reasoning have been developed in each of the
consultation systems described below.

From the standpoint of AI, medical diagnosis is a hypothesis formation (see article C4)
problem. The diagnosis task is to use the clinical findings to form a consistent set of disease
hypotheses (not to use findings to select one disease from a fixed set of possible diseases).
These hypotheses are typically related to one another in various ways. Each existing system
exhibits a different approach to this hypothesis formation problem.


The State of the Art

The state of the art in computer-based medical decision making is represented by the
programs described in the following articles. These programs are MYCIN (Shortliffe, 1976),
INTERNIST (Pople, 1975), CASNET/GLAUCOMA (Weiss, Kulikowski, & Safir, 1978), PIP
(Szolovits & Pauker, 1978), IRIS (Trigoboff & Kulikowski, 1977), and the Digitalis Advisor
(Silverman, 1976; Swartout, 1977b). There are now several other programs under
development that use the techniques and ideas developed in the above systems. These
include PUFF (Feigenbaum, 1977), a pulmonary function program; HODGKINS (Safrans,
Desforges, & Tsichlis, 1976), a system for performing diagnostic planning for Hodgkin's
disease; and HEAD-MED (Heiser, 1977, 1978), a psychopharmacology advisor. During the
development of all these programs, certain issues arose concerning the construction of the
programs and their acceptance by the medical community. The major issues and the ways in
which these were addressed by the individual systems are also described below.

Representation of knowledge. Two distinct types of medical knowledge must be
represented: (a) general knowledge of diseases, manifestations, causal mechanisms, etc.,
and (b) specific knowledge about the patient, the current medical history, the current
therapies, etc. The usual representation formalisms of AI--semantic nets (see article
Representation.B2), production rules (Representation.B3), frames (Representation.B7), and
predicate calculus (Representation)--are not directly applicable because of the inexact
nature of medical knowledge. In all of the consultation systems that have been developed,
these representations have been augmented, typically with a numerical way of
expressing strength of belief or strength of association. For example, in MYCIN, the medical
knowledge is represented as a set of production rules augmented by "certainty factors."
These certainty factors express the strength of belief in the conclusion of a rule, given that
all of the premises are true. CASNET uses a causal network representation (basically a
semantic network with the one relation, CAUSES) where each CAUSES link is qualified by a
number that represents the strength of causality. In INTERNIST, a taxonomy of diseases is
stored as a huge tree with each node representing a disease. Associated with each disease
node is a list of manifestations, with numerical weights reflecting the strength of association
between the disease and the manifestation. In PIP, the frame formalism is augmented by
numbers that reflect both the strength of belief in a slot filler and the degree to which the
frame itself applies to this patient. In IRIS, where the semantic net and production rule
formalisms have been combined, a facility for incorporating an arbitrary representation of
strength of belief has been included. Finally, a procedural representation is used in the
Digitalis Advisors; it contains a mathematical model of the action of digitalis.
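
The idea of augmenting a symbolic formalism with numbers can be sketched as plain data structures. The classes below are hypothetical illustrations of a rule carrying a certainty factor and of a CAUSES link carrying a strength; they are not the actual internal structures of MYCIN or CASNET, and all names and values are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductionRule:
    premises: tuple   # conditions that must all hold
    conclusion: str
    certainty: float  # strength of belief in the conclusion, given the premises


@dataclass(frozen=True)
class CausalLink:
    cause: str
    effect: str
    strength: float   # strength of causality attached to the CAUSES relation


def effects_of(state, links):
    """Return (effect, strength) pairs for every CAUSES link leaving `state`."""
    return [(link.effect, link.strength) for link in links if link.cause == state]


# Placeholder knowledge with no medical content.
rule = ProductionRule(("finding_a", "finding_b"), "hypothesis_x", 0.7)
links = [
    CausalLink("state_1", "state_2", 0.8),
    CausalLink("state_2", "state_3", 0.5),
]
print(effects_of("state_1", links))
```

In both cases the symbolic part (premises, conclusion, cause, effect) is unchanged; only a number expressing inexactness has been attached to it.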

Clinical reasoning. Clinical reasoning is based on the ways different pieces of
evidence for particular hypotheses are combined. Each system has a different approach to
this problem, but most employ the technique of thresholding: If the numerical score of a
hypothesis exceeds a certain pre-set threshold (defined by the expert physician), then the
hypothesis is believed to be true. The clinical reasoning of MYCIN involves determining
parameters (e.g., the infections and causative organisms of a patient) using production rules.
The premises of a rule are considered true if the combined value of the associated certainty
factors exceeds a predefined threshold. If several rules contribute to a conclusion about a
parameter, then their certainty factors are functionally combined to form a composite
certainty factor for this conclusion. These certainty-factor combining functions are based
on probability theory. In CASNET, a status measure is associated with each state in the
causal network. Weights are propagated both in the forward and backward direction
depending on disease causality. A state is considered "confirmed" if its status exceeds a
specified threshold. In INTERNIST, disease hypotheses are scored by a procedure that
takes account of the strength of association among (a) the manifestations exhibited by the
patient and the disease, (b) the manifestations associated with the disease that are not
present in the patient, and (c) the confirmed diseases causally related to this one. Disease
hypotheses are ranked, and the top-ranked diseases are investigated further. When the
difference between the scores of the top two disease hypotheses reaches a predefined
criterion, the top-ranking disease is confirmed. PIP combines two different methods of
reasoning: categorical and probabilistic. Categorical decisions are based on logical criteria
rather than numerical values. The probabilistic reasoning involves scoring the disease. A
frame can be confirmed either on logical or probabilistic criteria. In IRIS, an attempt is made
to confirm nodes of a semantic net as being true for the patient. Information is passed
between the nodes of the semantic net via sets of production rules associated with the
links. These production rules can encode both logical and probabilistic decisions.
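
Two of the mechanisms just described, combining certainty factors and confirming a hypothesis by threshold or by lead over the runner-up, can be sketched briefly. The combining function below, cf1 + cf2(1 - cf1), is the form usually quoted for two positive certainty factors in the MYCIN literature; this text does not state it, so it should be read as an assumption. The function names, thresholds, and scores are hypothetical.

```python
def combine_cf(cf1, cf2):
    """Combine two positive certainty factors supporting the same conclusion."""
    return cf1 + cf2 * (1.0 - cf1)


def composite_cf(cfs):
    """Fold the contributions of several rules into one composite factor."""
    total = 0.0
    for cf in cfs:
        total = combine_cf(total, cf)
    return total


def is_confirmed(score, threshold):
    """Thresholding: believe the hypothesis once its score exceeds the threshold."""
    return score > threshold


def confirm_top(scores, margin):
    """Confirm the top hypothesis only if it leads the runner-up by `margin`."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None
    if len(ranked) == 1:
        return ranked[0][0]
    (top, s1), (_, s2) = ranked[0], ranked[1]
    return top if s1 - s2 >= margin else None


print(composite_cf([0.6, 0.5]))
print(confirm_top({"d1": 90, "d2": 40}, margin=45))
```

With this combining function, each additional supporting rule moves the composite factor closer to 1 without ever exceeding it, and the order in which rules contribute does not matter.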

Explanation and justification. The explanation and the justification of a system's line
of reasoning are important factors for the acceptance of consultation systems by physicians.
Explanation involves showing the user the line of reasoning used in making a particular
diagnosis; justification is concerned with the medical accuracy and reliability of the
knowledge and the reasoning strategies used.

Only two systems currently address the issue of explanation. MYCIN explains a
diagnosis by printing out an English version of the chain of rules used. More complex
explanation facilities are provided by TEIRESIAS (Davis, 1977), an explanation and
knowledge acquisition system developed in the context of MYCIN. The OWL Digitalis Advisor
provides English explanations of its reasoning that are generated directly from the OWL
code. The detail of the explanation can be controlled by the program. Both INTERNIST and
CASNET are able to summarize the consultation by displaying scores of the hypotheses and
statuses of states; however, they are unable to explain the methods they used to arrive at
these scores.
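
Printing the chain of rules behind a conclusion can be sketched as follows. This is only an illustration of the general idea; the tuple layout and the output wording are assumptions and do not reproduce MYCIN's actual explanation format.

```python
def explain(chain):
    """Render a chain of fired rules as English-like text.

    chain: list of (premises, conclusion, certainty) tuples in firing order.
    """
    lines = []
    for i, (premises, conclusion, certainty) in enumerate(chain, start=1):
        condition = " AND ".join(premises)
        lines.append(
            f"Rule {i}: IF {condition} THEN {conclusion} (certainty {certainty:.1f})"
        )
    return "\n".join(lines)


# Placeholder rules with no medical content.
fired = [
    (("finding_a", "finding_b"), "hypothesis_x", 0.7),
    (("hypothesis_x",), "recommendation_y", 0.9),
]
print(explain(fired))
```

Because the explanation is generated from the same rules that drove the reasoning, it cannot drift out of step with what the program actually did.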

The issue of justification is a complex one. Both CASNET and MYCIN can cite
references to the research literature in support of diagnoses and treatment
recommendations. CASNET is able to provide alternative recommendations based on differing
expert opinions. At the heart of the justification issue is the accuracy and reliability of the
expert's knowledge and whether this knowledge has been accurately captured in the
representation formalism. Often medical experts have differing opinions, and it is not clear
whether a consensus should be sought or whether the different opinions should all be
represented. CASNET and MYCIN have been developed with the collaboration of groups of
experts, and the rules typically represent a general consensus of opinion. The other systems
were developed with one main expert, so consensus was not an issue.

Validation. Just as the various instruments and drugs used by physicians must be
validated, so must consultation programs. So far, CASNET and MYCIN have undergone
relatively extensive clinical trials and have been rated as "expert" in their respective
domains by human experts. INTERNIST has yet to undergo formal clinical trials, but it is
informally rated as an expert in internal medicine. The Digitalis Therapy Advisors have
performed well in limited trials.

Acquisition of knowledge. Knowledge acquisition is the transfer of the experts'
knowledge and expertise to the program. Currently the only successful way of doing this is
through a knowledgeable intermediary, although eventually experts should be able to
communicate directly with the consultation program (see Article B on the TEIRESIAS system).


Applications-oriented AI Research: Medicine


Concluding  Remarks 

Despite the extensive work that has been done, none of these systems is in routine
clinical use. Physicians have not for the most part accepted them. The main reason is that
they have yet to satisfy the "indispensability" criterion: They are not indispensable to the
practice of medicine and physicians perform adequately without them. The only AI program
that is in routine medical use is PUFF, a pulmonary function program, which is used because it
saves the physician a lot of time. Constructed using EMYCIN (the MYCIN system with the
knowledge of infectious diseases removed), PUFF uses a set of about 55 rules about
pulmonary dysfunction. The program suggests treatment recommendations that can be
overridden by the physician.

In order for AI programs to make a significant impact on health care, at least in the
short term, it appears that PUFF's example should be followed. The ingredients for a
successful application in medicine seem to be (a) a careful choice of the medical problem
and (b) the cooperation of interested experts. The domain must be narrow and relatively
self-contained; the use of the computer should aid, not replace, the physician; and the task
should be one that the physician either cannot do or is willing to let a computer do.

To summarize, the main focuses of activity in the area of medical decision making today
are: knowledge engineering, the acquisition of knowledge from experts; knowledge
representation, for building and maintaining the large medical knowledge bases; strategy
design, for reasoning with the medical knowledge; and program designs that feature
the capability to explain their reasoning to users.


References 

Feigenbaum (1977) gives a short review of this area of research. Most of the work on
medical systems is discussed in detail in the AIM Workshop proceedings (AIM, 1975-78).
Recent work on some of the important systems is described in a special issue of the Journal
of Artificial Intelligence (Sridharan, 1978).


B.  TEIRESIAS: Issues in Expert Systems Design

TEIRESIAS is a system for facilitating automatic acquisition and maintenance of the
large knowledge bases used by expert systems. Although TEIRESIAS is not itself an
application of AI to some domain, it deals with many important issues in expert systems
design that are relevant to all of the programs described in this chapter. The system was
developed by Randall Davis as part of his doctoral research at the MYCIN project at
Stanford, and this article assumes some familiarity with MYCIN's rule-based knowledge
representation scheme and its backward-chaining control structure (see Article C1). However,
the ideas and techniques that TEIRESIAS uses are not necessarily limited to MYCIN's domain
of infectious diseases or to the production-rule formalism used by MYCIN.


Knowledge-based  Programs 

As discussed in the Applications.Overview, systems that achieve expert-level
performance in problem-solving tasks derive their power from a large store of task-specific
knowledge. As a result, the creation and management of large knowledge bases and the
development of techniques for the informed use of knowledge are now central problems of AI
research. TEIRESIAS was written to explore some of the issues involved in solving these
problems.

Most  expert  programs  embody  the  knowledge  of  one  or  more  experts  in  a  field,  like 
infectious diseases, and are constructed in consultation with these experts. Typically, the
computer scientist mediates between the experts and the program he is building to model
their expertise. This is a difficult and time-consuming task, because the computer scientist
must  learn  the  basics  of  the  field  in  order  to  ask  good  questions  about  what  the  program  is 
supposed  to  do. 

TEIRESIAS's goal is to reduce the role of the human intermediary in this task of
knowledge acquisition, by assisting in the construction and modification of the system's
database. The human expert communicates, via TEIRESIAS, with the performance program
(e.g., MYCIN), so that he can discover, with TEIRESIAS's help, what the performance program
is doing and why. TEIRESIAS offers facilities for modifying or adding to the knowledge base
to correct errors: Using TEIRESIAS, the human expert can "educate" the program just as he
would tutor a human novice who makes mistakes. Ideas about how this "debugging" process
is best carried out are at the core of TEIRESIAS's success.

TEIRESIAS  also  recognizes  the  inexact,  experiential  character  of  the  knowledge  that  is 
often  required  for  knowledge-based  systems  and  (as  examples  below  will  illustrate)  offers 
the  expert  some  assistance  in  formulating  new  "chunks  of  knowledge"  of  this  sort.  Another 
major  aim  of  the  system  was  to  provide  a  mechanism  for  embodying  strategic  information. 
Meta-rules  (discussed  below)  are  used  to  direct  the  use  of  object-level  rules  in  the 
knowledge  base  and  to  provide  a  mechanism  for  encoding  problem-solving  strategies. 


Interactive  Transfer  of  Expertise

It is an established result that an expert knows more about a field than he is aware of or
is capable of articulating completely. Thus, asking him a broad question like "Tell me everything
you know about staph-infections" will yield only a fraction of his knowledge. TEIRESIAS's
approach is to present the expert with some errors made by an already established, but still
incomplete, knowledge-based program and to ask a focused question: "What do you know that
the program doesn't know, which makes your expert diagnosis different in this case?"

This interaction is called transfer of expertise: TEIRESIAS incorporates into the
performance program the capabilities of the human expert. TEIRESIAS does not attempt to
derive new information on its own but, instead, tries to "listen" as attentively and
intelligently as possible, to help the expert augment or modify the knowledge base.

Interactive transfer of expertise between an expert and an expert program begins
when the expert identifies an error in the performance of the program and invokes TEIRESIAS
to help track down and correct the error. Errors are manifest as program responses that the
expert would not have made or as "lines of reasoning" that the expert finds odd,
superfluous, or otherwise inappropriate. The first kind of error might be, for example, a
wrong conclusion about the identity of a bacterium. The second kind arises when the
performance program asks the expert, during a consultation, a question that, in the expert's
opinion, does nothing to resolve the identity of the bacterium. This is an example of the "line
of reasoning" type of error.

Both kinds of error are assumed, by TEIRESIAS, to be indicative of a deficit, or "bug," in
the performance program's knowledge base. Transfer of expertise begins when TEIRESIAS is
called upon to correct the deficit. TEIRESIAS fixes bugs in the knowledge base by:

1.  Stopping the performance program when the human expert identifies an error.

2.  Working backwards through the steps in the performance program that led to
the error, until the bug is found.

3.  Helping the expert fix the bug by adding or modifying knowledge.

To identify faulty reasoning steps in the performance program, the expert can use the WHY
and HOW commands to ask TEIRESIAS to back up through previous steps, explaining why they
were taken. The same explanatory abilities can also be used when there is no bug, to help
the user follow the system's line of reasoning. Since many large performance programs carry
out very complex inferences that are essentially "hidden" from the person using the program,
this is a valuable facility.


Meta-level  Knowledge 

One of the principal problems of AI is the question of appropriate representation and
use of knowledge about the world (see Representation). Numerous techniques have been
used to represent domain knowledge in various applications programs. A central theme of the
research on TEIRESIAS is exploring the use of meta-knowledge. Meta-level knowledge is
simply the representation in the program of knowledge about the program itself: about how
much it knows and how it reasons. This knowledge is represented using the same
representation techniques used to represent the domain knowledge, yielding a program
containing object-level representations describing the external world and meta-level
representations that describe the internal world of the program, its self-knowledge. For
example, many AI programs use the notion of a frame to represent the knowledge used by the
system (see the article on frames in the Representation chapter). One can imagine a meta-level
frame that describes the structure of all frames in the system or one that denotes the
different classes of frames used in the system. One of TEIRESIAS's representations, the
schema described below, is very close to this notion.

Meta-level knowledge has taken several different forms as its uses have been
explored, but it can be summed up as "knowing about what you know." In general, it allows
the system both to use its knowledge directly and to examine it, abstract it, and direct its
application. The capabilities for explanation, knowledge acquisition, and strategic reasoning
in TEIRESIAS inspired the incorporation of explicit meta-level knowledge, and these
capabilities are based on the use of that knowledge.


Explanation 

There are two important classes of situations where expert systems should be able to
explain their behavior and results. For the user of the system who needs clarification or
reassurance about the system's output, the explanation can contribute to the transparency
and thus the acceptance of the system. The second major need for explanation is in the
debugging process described above, where a human expert uses the system's explanations
of why it has done what it has done, in order to locate some error in the database. The first
of these applications of explanation has been explored in the question-answering facility of
the MYCIN system; the explanation capability in TEIRESIAS has explored both uses but has
concentrated on the latter.

The techniques used in TEIRESIAS for generating explanations are based on two
assumptions about the performance program being examined, namely, (a) that a
recapitulation of program actions can be an effective explanation, as long as the correct
level of detail is chosen, and (b) that there is some shared framework for viewing the
program's actions that will make them comprehensible to the user. In the MYCIN-like expert
systems that use production-rule knowledge bases, these assumptions are valid, but it is
easy to imagine expert systems where one or both are violated. For example, the first
assumption simplifies the explanation task considerably, since it means that the solution
requires only the ability to record and play back a history of events. This assumption rules
out, in particular, any need to simplify those events. However, it is not obvious, for instance,
that an appropriate level of detail can always be found. Furthermore, it is not obvious how
this approach of recapitulation, which often offers an easily understood explanation in
programs that reason symbolically, would be applied to expert systems that perform primarily
numeric computations.

A simple recapitulation will be an effective explanation only if the level of descriptive
detail is constrained. It must be detailed enough that the operations the system cites are
comprehensible; the conceptual level must be high enough that the operations are meaningful
to the observer, so that unnecessary detail is suppressed; and it must be complete enough
that the operations cited are sufficient to account for all behavior.

The second assumption concerns the user's comprehension of the expert system's
activity, which depends on the fundamental mechanism used by the program and the level at
which it is examined. Consider a program that does medical diagnosis using a statistical
approach based on Bayes's Theorem. It is difficult to imagine what explanation of its actions
the program could give if it were queried about computed probabilities. No matter what level
of detail is chosen, such a program's actions are not (nor were they intended to be) a model
of the reasoning process typically employed by physicians. Although they may be an
effective way for the computer to solve the diagnosis problems, there is no easy way to
interpret these actions in terms that will make them comprehensible to humans unacquainted
with the program.

Thus,  the  lack  of  mechanisms  for  simplifying  or  reinterpreting  computation  means  that 
TEIRESIAS's  approach  is  basically  a  first-order  solution  to  the  general  problem  of  explanation. 
But,  in  the  context  of  a  MYCIN-like  expert  system,  for  which  TEIRESIAS  was  designed,  the 
simple  AND/OR  goal  tree  control  structure  offers  a  basis  for  explanations  that  typically 
needs  little  additional  clarification.  (The  operation  of  TEIRESIAS's  explanation  facility  is 
illustrated  in  the  sample  protocol  at  the  end  of  this  article.)  The  invocation  of  a  rule  is  taken 
as  the  fundamental  action  of  the  system.  This  action,  within  the  framework  of  the  goal  tree, 
accounts  for  enough  of  the  system's  operation  to  make  a  recapitulation  of  such  actions  an 
acceptable explanation. In terms of the constraints noted earlier, it is sufficiently detailed:
The actions performed by a rule in making a conclusion, for instance, correspond closely
enough to the normal connotation of that word that no more detailed explanation is
necessary. The explanation is still at a high enough conceptual level that the operations are
meaningful, and the explanation is complete enough: There are no other mechanisms or
sources of information that the observer needs to know in order to understand how the
program reached its conclusions.


Knowledge-acquisition:  Rule  Models  and  Schemata 

When the expert has identified a deficit in the knowledge base of the performance
program,  TEIRESIAS  questions  him  in  order  to  correct  the  deficit.  This  process  relies  heavily 
on  meta-level  knowledge  about  the  performance  program,  encoded  in  rule-models  and 
schemata.  In  other  words,  TEIRESIAS  knows  about  what  the  performance  program  knows. 

The meta-level knowledge about objects in the domain includes both structural and
organizational information and is specified in data structure schemata. Acquisition of knowledge
about new objects proceeds as a process of instantiating a schema: creating the required
structural components to build the new data structure and then attending to its interrelations
with other data structures. By making inquiries in a simple form of English about the values
of the schema's components, this knowledge acquisition process is made to appear to the
expert as a natural, high-level inquiry about the new concept. The process is, of course,
more complex, but the key component is the system's description of its own representation.
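
The idea of acquiring a new object by instantiating a schema can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Slot and Schema classes, the slot names, and the prompts are hypothetical and are not taken from TEIRESIAS, whose schemata also record interrelations with other data structures that are omitted here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Slot:
    name: str                      # component of the new data structure
    prompt: str                    # English question put to the expert
    parse: Callable[[str], str]    # normalizes the typed answer


@dataclass
class Schema:
    kind: str
    slots: List[Slot]


def instantiate(schema: Schema, answer: Callable[[str], str]) -> Dict[str, str]:
    """Create a new object by asking one question per schema slot."""
    instance = {"kind": schema.kind}
    for slot in schema.slots:
        instance[slot.name] = slot.parse(answer(slot.prompt))
    return instance


# Hypothetical schema for a new organism; slot names are illustrative only.
ORGANISM_SCHEMA = Schema(
    kind="organism",
    slots=[
        Slot("name", "What is the name of the new organism?", str.upper),
        Slot("gram_stain", "What is its gram stain?", str.lower),
        Slot("morphology", "Is it a rod or a coccus?", str.lower),
    ],
)

# Scripted answers stand in for the expert typing at a terminal.
scripted = iter(["klebsiella-pneumoniae", "Negative", "Rod"])
new_organism = instantiate(ORGANISM_SCHEMA, lambda prompt: next(scripted))
```

Because the questions are generated from the schema rather than written by hand, adding a slot to the schema changes the dialogue without changing the acquisition procedure.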

TEIRESIAS's rule models are empirical generalizations of subsets of rules, indicating
commonalities among the rules in that subset. For example, in MYCIN there is a rule model for
the  subset  of  rules  that  conclude  affirmatively  about  organism  category,  indicating  that  most 
such  rules  mention  the  concepts  of  culture  site  and  infection  type  in  their  premise.  Another  rule 
model  notes  that  those  rules  that  mention  site  and  infection  type  in  the  premise  also  tend  to 
mention  the  portal  of  entry  of  the  organism. 

This knowledge about the contents of the domain rules is used by TEIRESIAS to build
expectations about the dialogue. These expectations are used to facilitate the process of
translating the English statements into the performance program's internal representation
and to identify information missing from the expert's entry. An example of TEIRESIAS's use
of rule models in its knowledge acquisition dialogue is given in the sample protocol below.
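
A rule model of this kind can be approximated by counting which attributes recur in the premises of rules that share a conclusion. The sketch below is an assumption-laden illustration, not Davis's algorithm: the five rules, the attribute names, and the 0.75 frequency threshold are all invented for the example.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical miniature rule base: premise attributes plus the attribute concluded.
RULES: List[Dict] = [
    {"id": "R1", "premise": {"site", "infection", "portal"}, "concludes": "category"},
    {"id": "R2", "premise": {"site", "infection", "portal"}, "concludes": "category"},
    {"id": "R3", "premise": {"site", "infection", "portal", "gram"}, "concludes": "category"},
    {"id": "R4", "premise": {"site", "gram"}, "concludes": "category"},
    {"id": "R5", "premise": {"gram", "morphology"}, "concludes": "identity"},
]


def build_rule_model(rules: List[Dict], concludes: str, threshold: float = 0.75) -> Set[str]:
    """Return the attributes mentioned by at least `threshold` of the matching rules."""
    subset = [r for r in rules if r["concludes"] == concludes]
    if not subset:
        return set()
    counts = Counter(attr for r in subset for attr in r["premise"])
    return {attr for attr, n in counts.items() if n / len(subset) >= threshold}


def missing_attributes(model: Set[str], new_premise: Set[str]) -> Set[str]:
    """Attributes the model expects that the expert's new rule does not mention."""
    return model - new_premise


category_model = build_rule_model(RULES, "category")
# A new rule that mentions only the infection and the site of the culture:
gaps = missing_attributes(category_model, {"infection", "site"})
```

Here the model built from the toy rules expects site, infection, and portal, so the new rule is flagged as lacking a portal-of-entry clause, which is the kind of prompt that appears in the sample protocol.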


Meta-rules  and  Performance  Strategies 

In performance programs with sufficiently small knowledge bases (like MYCIN's),
exhaustive invocation of the relevant parts of the knowledge base during a consultation is
still computationally feasible. In time, however, with the inevitable construction of larger
knowledge bases, exhaustive invocation will prove too slow. In anticipation of this
eventuality, meta-rules are implemented in TEIRESIAS as a means of encoding strategies that
can direct the program's actions more selectively than can exhaustive invocation. The
following meta-rule is from MYCIN's infectious disease domain:


METARULE 001

If   1) the infection is a pelvic-abscess, and
     2) there are rules which mention in their
        premise enterobacteriaceae, and
     3) there are rules which mention in their
        premise gram positive rods,

Then There is suggestive evidence (.4) that the rules
     dealing with enterobacteriaceae should be evoked
     before those dealing with gram positive rods.

This  rule  suggests  that  since  enterobacteriaceae  are  commonly  associated  with  a  pelvic 
abscess,  it  is  a  good  idea  to  try  rules  about  them  first,  before  the  less  likely  rules  mentioning 
gram  positive  rods.  Note  that  this  meta-rule  does  not  refer  to  specific  object-level  rules. 
Instead it specifies certain attributes of the rules it refers to, for example, that they mention
in their premise enterobacteriaceae.
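
The effect of such a meta-rule on rule invocation can be sketched as a reordering step that selects rules by their content rather than by name. The code is a simplified illustration: the rule identifiers and the "premise_mentions" field are hypothetical, the certainty factor (.4) is returned but not otherwise used, and placing unrelated rules between the two groups is an assumption of this sketch.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical object-level rules, described only by what their premises mention.
RULES: List[Dict] = [
    {"id": "RULE-A", "premise_mentions": {"gram-positive-rods"}},
    {"id": "RULE-B", "premise_mentions": {"enterobacteriaceae"}},
    {"id": "RULE-C", "premise_mentions": {"site"}},
]


def metarule_001(context: Dict, rules: List[Dict]) -> Optional[Tuple[float, List[Dict], List[Dict]]]:
    """Return (certainty, rules to try first, rules to try later), or None if inapplicable."""
    if context.get("infection") != "pelvic-abscess":
        return None
    first = [r for r in rules if "enterobacteriaceae" in r["premise_mentions"]]
    later = [r for r in rules if "gram-positive-rods" in r["premise_mentions"]]
    if not first or not later:
        return None
    return 0.4, first, later


def order_rules(context: Dict, rules: List[Dict]) -> List[Dict]:
    """Reorder the rules according to the meta-rule's advice, if it applies."""
    advice = metarule_001(context, rules)
    if advice is None:
        return list(rules)
    _certainty, first, later = advice
    first_ids = {r["id"] for r in first}
    later_ids = {r["id"] for r in later}

    def rank(rule: Dict) -> int:
        if rule["id"] in first_ids:
            return 0
        if rule["id"] in later_ids:
            return 2
        return 1  # rules the meta-rule says nothing about keep a middle position

    return sorted(rules, key=rank)  # sorted() is stable within each rank


ordered = [r["id"] for r in order_rules({"infection": "pelvic-abscess"}, RULES)]
```

Note that metarule_001 never names RULE-A or RULE-B; it finds them by inspecting their premises, which is the property the text emphasizes.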


An  Example:  TEIRESIAS  in  the  Context  of  MYCIN

We will now illustrate TEIRESIAS's operation in affiliation with the MYCIN system (see
Article C1), paying particular attention to TEIRESIAS's explanation and knowledge acquisition
facilities. MYCIN provides the physician with advice about the diagnosis and drug therapy for
bacterial infections. The system asks questions about the patient, the infection, the
cultures grown from specimens from the patient, and any organisms (bacteria) growing in
the culture. (Typically, of course, the exact identity of the organism is not yet known.)

MYCIN's database is composed of rules that specify a situation (involving information
about the patient, culture, and organism) and the conclusions that can be drawn in that
situation. For example, to conclude whether a patient suffers from a bacterium in the
enterobacteriaceae category, MYCIN invokes Rule 95 of its knowledge base:

Rule 095

IF    the site of the culture is blood, and
      the gram stain is positive, and
      the portal of entry is gastrointestinal tract, and
      [A-the abdomen is the locus of infection, or
       B-the pelvis is the locus of infection]

THEN  there is strongly suggestive evidence that
      enterobacteriaceae is the class of organisms for
      which therapy should cover.

MYCIN  asks  questions  of  the  physician  to  determine  which  rules  are  applicable  in  the  current 
case — i.e.,  to  determine  the  "situation."  To  determine  if  the  organism  is  of  type 
enterobacteriaceae, MYCIN would inquire whether each clause in Rule 95 is true.

The  procedure  of  working  backwards  from  the  conclusion  to  the  situation  that  would 
make  the  conclusion  true  is  called  backward  chaining  and  is  MYCIN's  fundamental  control 
structure.  Backward  chaining  can  go  "several  levels  deep,"  meaning  that  the  clauses  in  a 
situation  that  must  be  true  (for  the  conclusion  to  be  true)  might  themselves  be  conclusions  of 
other  rules.  For  example,  a  condition  on  the  identification  of  enterobacteriaceae  is  that  the 
infection be localized in the abdomen or the pelvis; but this condition is itself a conclusion,
and it is only true if its conditions (that the culture from these places is recent, and is
therapeutically significant) are true. Furthermore, the condition of therapeutic significance
has its own conditions, and so on. Here is a protocol showing MYCIN's basic style of
interaction; the human expert's responses are prefixed with **:


- Patient 281 -

1)  Patient's name:
**  Albert Hall

2)  Sex:
**  Male

3)  Age:
**  34

4)  Have you been able to obtain positive microbiological
    information about a possible infection of Albert Hall?
**  Yes

- Infection 1 -

5)  What is the infection?
**  Primary-Bacteremia

6)  Please state when signs and symptoms of infection-1 first appeared:
**  15-Dec-76 12:00

- Culture 1 -

7)  From what site was the specimen for culture-1 taken?
**  Blood

8)  Please give the date and time when this blood culture was obtained:
**  16-Dec-76 12:00

The first significant organism from this blood culture will be
referred to as:

- Organism-1 -

9)  Enter the identity of ORGANISM-1
**  Unknown

10) Is ORGANISM-1 a rod, or coccus, (etc.):
**  Rod

11) The gram-stain of ORGANISM-1:
**  Gram negative

12) Did ORGANISM-1 grow in the aerobic bottle?
**  Yes

13) Did ORGANISM-1 grow in the anaerobic bottle?
**  Yes


My therapy recommendations will be based on the following probable
infection(s) and potential causative organism(s):

INFECTION-1 is PRIMARY-BACTEREMIA
<Item-1>  E.COLI [ORGANISM-1]
<Item-2>  PSEUDOMONAS-AERUGINOSA [ORGANISM-1]


FIGURE  1.  PART  OF  A  DIALOGUE  WITH  MYCIN. 

In sum, MYCIN constructs a goal tree (see Article Search.Problem.Reduction) of questions that
must be resolved in the course of the consultation to conclude the identity of a bacterium.
When it cannot resolve a question by inference from what it knows already, it asks the
respondent to provide an answer. Each node of the goal tree has subnodes that result from
the application of a rule:


Figure  2.  A  MYCIN  Goal  Tree
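
The backward-chaining procedure that builds this goal tree can be sketched as a small recursive function. The sketch makes several simplifying assumptions that are not MYCIN's: clauses are plain strings that are either true or false (no certainty factors), the disjunction in Rule 095 is collapsed into the single clause "site is locus of infection", and the clause wording and the `establish` function are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

Rule = Tuple[str, List[str], str]  # (rule id, premise clauses, conclusion)

# Simplified stand-ins for the two rules discussed in the text.
RULES: List[Rule] = [
    ("RULE095",
     ["site is blood", "gram stain is positive",
      "portal of entry is g.i.", "site is locus of infection"],
     "category is enterobacteriaceae"),
    ("RULE021",
     ["culture is recent", "disease is therapeutically significant"],
     "site is locus of infection"),
]


def establish(goal: str, facts: Dict[str, bool],
              ask: Callable[[str], bool], trace: List[Tuple[str, str]]) -> bool:
    """Decide `goal` by chaining backward through rules, asking only at the leaves."""
    if goal in facts:                       # already known or already asked
        return facts[goal]
    relevant = [rule for rule in RULES if rule[2] == goal]
    if not relevant:                        # no rule concludes it: ask the user
        facts[goal] = ask(goal)
        return facts[goal]
    result = False
    for rule_id, premise, _conclusion in relevant:
        trace.append((goal, rule_id))       # remember which rule served which goal
        if all(establish(clause, facts, ask, trace) for clause in premise):
            result = True
            break
    facts[goal] = result
    return result


answers = {
    "site is blood": True,
    "gram stain is positive": True,
    "portal of entry is g.i.": True,
    "culture is recent": True,
    "disease is therapeutically significant": True,
}
asked: List[str] = []


def ask(question: str) -> bool:
    asked.append(question)                  # record the order of the questions
    return answers[question]


trace: List[Tuple[str, str]] = []
concluded = establish("category is enterobacteriaceae", {}, ask, trace)
```

The `trace` list records each (goal, rule) pair as it is tried; a record of this kind is what an explanation facility needs in order to retrace the reasoning afterwards.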



Explanation: HOW and WHY. Suppose that during a consultation with MYCIN the
clinician wants an explanation of the reasoning behind some question that MYCIN has asked
him. The explanation given by TEIRESIAS in response to the physician's question, WHY,
involves backing up the goal tree to the previous node and presenting the rule used to
derive the subnode from its ancestor:

MYCIN:     What is the suspected portal of entry of ORGANISM-1 into
           this sterile site?

Doctor:    WHY

TEIRESIAS: [i.e., why is it important to determine the suspected
           portal of entry of ORGANISM-1 into this sterile site?]

           [3.0] ...in order to find out about an organism or class
                 of organisms for which therapy should cover.

           It has already been established that
           [3.1] the site of the culture is blood
           [3.2] the stain of the organism is gram positive
           Therefore if:
           [3.3] the portal of entry of the organism is g.i.
           [3.4] [Abdomen is the locus of infection, or
                  Pelvis is the locus of infection]
           Then
           There is strongly suggestive evidence (.9) that
           enterobacteriaceae is the class of organisms for
           which therapy should cover.
           [Rule 095]


In  contrast  to  WHY,  which  spells  out  "higher"  nodes  in  the  goal  tree,  the  HOW  command 
asks  TEIRESIAS  to  explain  the  nodes  that  have  been  or  will  be  expanded  lower  down  the 
tree. For example, in the following exchange, the expert asks how MYCIN will determine
whether  the  abdomen  or  pelvis  is  the  locus  of  infection: 

Doctor:    HOW [3.4]     (Note: 3.4 is the prefix of the statement
                         of interest in the exchange above.)

TEIRESIAS: [i.e., how were you determining that the Abdomen is the
           locus of infection, or that the Pelvis is the locus of infection?]

           [4.0] At that point Rule 021 was being used.

If the doctor now wants to see Rule 021, he asks TEIRESIAS to explain HOW it was used,
referring to the prefix (4.0) of the statement in which it was mentioned:

Doctor:    HOW [4.0]

TEIRESIAS: [i.e., how was Rule 021 used?]

           [5.0] It has already been established that
           [5.1] the culture is recent.
           Therefore if:
           [5.2] there is therapeutically significant disease
                 associated with the occurrence of this organism
           then
           It is definite (1.0) that the site of the culture is
           the locus of infection in the patient.


The explanation facilities of TEIRESIAS involve climbing up and down MYCIN's goal tree,
explaining WHY questions are asked and HOW they are answered. If MYCIN makes a
misdiagnosis, these same facilities are used (either by the expert or TEIRESIAS, at the
choice of the expert) to back up through the rules that achieved the diagnosis until the
expert has found the faulty one or determined that one is missing. TEIRESIAS is then used to
expedite the alteration or construction of rules by the expert.
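
The relationship between the two commands and the goal tree can be made concrete with a small sketch: WHY looks one step up the tree, HOW looks one step down. The Node class, the `why` and `how` functions, and the wording of their answers are assumptions made for this illustration; TEIRESIAS's actual output is shown in the exchanges above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    goal: str
    rule: Optional[str] = None   # rule whose premise introduced this goal (None at the root)
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def add(self, goal: str, rule: str) -> "Node":
        """Attach a subgoal that `rule` needs in order to conclude about this goal."""
        child = Node(goal, rule, self)
        self.children.append(child)
        return child


def why(node: Node) -> str:
    """Explain a goal by pointing to the node above it in the tree."""
    if node.parent is None:
        return f"{node.goal} is the top-level goal"
    return f"{node.goal} is needed by {node.rule} to determine {node.parent.goal}"


def how(node: Node) -> str:
    """Explain a goal by pointing to the rules used below it in the tree."""
    rules = sorted({child.rule for child in node.children})
    if not rules:
        return f"{node.goal} was asked of the user"
    return f"{node.goal} is determined using " + ", ".join(rules)


# A fragment of the goal tree from the exchanges above (labels abbreviated).
root = Node("category of organism")
portal = root.add("portal of entry", "Rule 095")
locus = root.add("locus of infection", "Rule 095")
locus.add("culture is recent", "Rule 021")
locus.add("therapeutic significance", "Rule 021")

why_portal = why(portal)
how_locus = how(locus)
```

Repeated WHY requests walk toward the root, and repeated HOW requests walk toward the leaves, which is how an expert can back up through the rules behind a misdiagnosis.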

Knowledge acquisition. TEIRESIAS is able to help an expert formulate rules in English
because it has expectations about what rules "look like." The expectations are stored
internally in the form of rule models that are formed automatically by examining all the rules in
the knowledge base. TEIRESIAS knows, for example, that there is a class of rules that have
as their conclusions the probable identity of a bacterium and that these rules typically have
as conditions (a) the site of the culture, (b) the stain of the culture, and (c) the probable
portal of entry of the bacterium. These rule models help TEIRESIAS translate English versions
of new rules into a machine representation acceptable to MYCIN.

To  illustrate  TEIRESIAS's  debugging  and  correction  facilities,  we  will  continue  with  the 
consultation  from  Figure  1.  Suppose  that  after  MYCIN  presents  the  expert  with  its  diagnosis, 
TEIRESIAS  steps  in  and  checks  whether  the  expert  agrees: 

MYCIN:  My therapy recommendation will be based on the following
probable infection(s) and potential causative organism(s):

INFECTION-1 is PRIMARY-BACTEREMIA
<Item-1>  E.COLI [ORGANISM-1]
<Item-2>  PSEUDOMONAS-AERUGINOSA [ORGANISM-1]


Now TEIRESIAS steps in.


Are all these identities correct?
**  No.

The expert's response, marked with **, is No.

Are there any identities missing?
**  Yes.

Which ones?
**  KLEBSIELLA-PNEUMONIAE, PROTEUS-NON-MIRABILIS

Are there any identities which shouldn't have appeared?
**  Yes.

Which one(s)?
**  PSEUDOMONAS-AERUGINOSA

Do you want to
    1] track down the problem yourself, or
    2] allow me to guide the debugging, or
    3] ignore it for now and fix it another time.
Type 1, 2, or 3.
**  2

If the expert had typed 1, he would have used the HOW and WHY commands to debug
the problem, as discussed above.

We'll start with PSEUDOMONAS-AERUGINOSA first:


TEIRESIAS starts debugging by trying to find out why an inappropriate diagnosis
occurred. Frequently solving a problem will make others "go away" as well.


The following rules were used in deducing that the identity of
ORGANISM-1 is PSEUDOMONAS-AERUGINOSA:

Rule 184

Since
[1.1] the category of ORGANISM-1 is not known
[1.2] the gram stain of ORGANISM-1 is gram negative
[1.3] the morphology of ORGANISM-1 is rod
[1.4] the aerobicity of ORGANISM-1 is facultative
There is weakly suggestive evidence (.3) that the identity of
ORGANISM-1 is PSEUDOMONAS-AERUGINOSA.

Is this rule correct?


Note  that  this  rule,  which  resulted  in  the  wrong  conclusion,  is  not  necessarily  an  incorrect 
rule.  It  might  be  (and  in  fact  is)  the  case  that  it  was  incorrectly  invoked  because  one  or 
more  of  its  conditions  are  true  when  they  should  be  false. 


**  Yes

The expert says the rule is correct.


Should its premise have failed for this case?
**  Yes

Which clause of the premise should have failed?
**  Clause 1.

The  expert  believes  that  MYCIN  should  have  known  the  category  of  ORGANISM-1. 
MYCIN  failed  to  establish  the  category,  causing  Rule  184  to  be  used. 


Should it have been possible to conclude a category?
**  Yes 
Which  one? 

**  Enterobacteriaceae 


Our old friend. But recall that MYCIN's Rule 095 about enterobacteriaceae requires a
positive gram stain, and it can be seen from Fig. 1 that ORGANISM-1 has a negative gram
stain. Thus, Rule 095 was never evoked. In the course of more debugging, it transpires that
MYCIN has a couple of other rules dealing with this category of bacteria, but they weren't
evoked either. The expert decides that another rule is required, and we pick up the dialogue
where he is proposing one to TEIRESIAS:


If:  1  **  THE  PATIENT'S  INFECTION  IS  PRIMARY-BACTEREMIA 

2  **  THE  SITE  IS  ONE  OF  THE  STERILE  SITES 
Then:  **  THERE  IS  EVIDENCE  (.8)  THAT  THE  CATEGORY  IS  ENTEROBACTERIACEAE 


TEIRESIAS  parses  this  rule  into  its  internal  representation  and  then  translates  it  back 
into  English,  so  the  expert  can  check  it.  We  pick  up  the  dialogue  after  the  expert  has 
agreed  on  the  parse.  TEIRESIAS  is  applying  its  rule  model. 


I  hate  to  criticize,  Dr.  Davis,  but  did  you  know  that  most  rules  about 
what  the  category  of  an  organism  might  be,  that  mention- 
the  site  of  a  culture 
the  infection 
ALSO  mention- 

[A]-  the  portal  of  entry  of  the  organism 
Shall  I  try  to  write  a  clause  to  account  for  [A]? 

**  Yes. 

Here we see one of the utilities of the rule model. TEIRESIAS prompts the expert to add
a bit of knowledge that is typically found in rules like these and offers to fix the rule
itself. This completion is accomplished by looking at other rules that fit the same rule
model, to find the most likely portal of entry clause.

How about -

[A]  The portal of entry is gastrointestinal. Ok?

**  Yes.

TEIRESIAS now does some finishing up: checking the complete rule with the expert for
final approval and asking the expert to write a brief description (for bookkeeping
purposes) of why the rule was needed. Finally, it reruns the consultation internally, using
the responses from Fig. 1, which it has stored. It turns out that adding the rule above did,
in fact, cure the other problems with the first consultation, and this time the diagnosis is
satisfactory to the expert.


Summary:  TEIRESIAS  and  Expert  Systems 

TEIRESIAS aids a human expert in monitoring the performance of a knowledge-based
system. When the human expert spots an error in the program's performance, either in the
program's conclusions or in its "line of reasoning," TEIRESIAS assists in finding the source of
the error in the database by explaining the program's conclusions, retracing the reasoning
steps until the faulty (or missing) rule is identified. At this point, TEIRESIAS assists in
knowledge acquisition, modifying faulty rules or adding new rules to the database. Meta-level
knowledge about the kinds of rules and concepts in the database is used to build
expectations in TEIRESIAS's model-based understanding process. Meta-level knowledge is also
used to encode problem-solving strategies, in particular, to order the invocation of rules so
that those most likely to be useful (given the current knowledge of the program) are
tried first.


References 

The  principal  reference  on  TEIRESIAS  is  the  doctoral  dissertation  by  Davis  (1976). 
Uses  of  meta-knowledge  in  expert  systems  are  discussed  in  Davis  &  Buchanan  (1977).  Also 
see  Davis  (1977)  and  Davis  (1978). 
C. Medical Systems
C1. MYCIN

The MYCIN system (Shortliffe, 1976) was designed to provide consultative advice on
diagnosis and therapy for infectious diseases. Such advice is often required in hospital care
because the attending physician may not be an expert on infectious diseases that show
themselves during a period of hospitalization, as, for example, when a cardiology patient
develops an infection after heart surgery. Time considerations compound the problem. A
specimen (blood, urine, etc.) from a patient can show some early evidence of bacterial
growth within 12 hours, but generally it takes 24 to 48 hours (or more) for positive
identification. The physician must therefore often decide, in the absence of complete
information, whether or not to start treatment and what drugs to use if treatment is required.
Both of these may be difficult questions.

The trace that follows shows the initial and final parts of a sample interaction between
a physician and MYCIN (italicized comments provide additional commentary but are not part of
the actual dialogue). MYCIN, in effect, "interviews" the doctor about his patient, collecting
information from which it infers the diagnosis and selects an appropriate therapy.

--------PATIENT-1--------

1)  Patient's  name:  (first-last) 

**  FRED  SMITH 

MYCIN  collects  some  background  data  about  the  patient. 

2)  Sex: 

**  MALE 

3)  Age: 

**  66 

4) Have you been able to obtain positive cultures from a site at which
Fred Smith has an infection?

**  YES 

--------INFECTION-1--------

5) What is the infection?

** PRIMARY-BACTEREMIA

6) Please give the date and approximate time when signs or symptoms of
the primary-bacteremia (INFECTION-1) first appeared. (mo/da/yr)

** MAY 6, 1976

The most recent positive culture associated with the primary-bacteremia
(INFECTION-1) will be referred to as:

--------CULTURE-1--------

7) From what site was the specimen for CULTURE-1 taken?

** BLOOD

8) Please give the date and time when this blood culture (CULTURE-1)
was obtained. (mo/da/yr)

** MAY 9, 1976

The first significant organism from this blood culture (CULTURE-1)
will be referred to as:

--------ORGANISM-1--------

9) Enter the identity of ORGANISM-1:

** UNKNOWN

Typically, the identity of the organism is not known and hence must be inferred by
MYCIN, using various other pieces of laboratory and clinical information. MYCIN
proceeds to ask questions that could help to determine the identity of the organism.

10) Is ORGANISM-1 a rod or coccus (etc.):

** ROD

11) The gram stain of ORGANISM-1:

** GRAMNEG

12) Have there been POSITIVE cultures from sites which are NOT
associated with infections about which you are seeking
advice?

**  NO 

13)  Did  ORGANISM-1  grow  aerobically? 

**  YES  (.8) 

14)  Did  ORGANISM-1  grow  anaerobically? 

**  YES 

15) What is the suspected portal of entry of ORGANISM-1
into this sterile site?

** GI (.6)


More  questions  follow,  to  gather  sufficient  information  to  infer  possible  identities  of  the 
infecting  organisms. 


My  therapy  recommendation  will  be  based  on  the  following  probable 
infection(s)  and  potential  causative  organism(s): 


INFECTION-1 is PRIMARY-BACTEREMIA

<Item 1> The identity of ORGANISM-1 may be PSEUDOMONAS-AERUGINOSA
<Item 2> The identity of ORGANISM-1 may be KLEBSIELLA-PNEUMONIAE
<Item 3> The identity of ORGANISM-1 may be E.COLI
<Item 4> The identity of ORGANISM-1 may be BACTEROIDES-FRAGILIS
<Item 5> The identity of ORGANISM-1 may be ENTEROBACTER
<Item 6> The identity of ORGANISM-1 may be PROTEUS-NON-MIRABILIS


46) On a scale of 0 to 4, where higher numbers indicate increasing
severity, how would you rate Fred Smith's degree of sickness?

** 3

47) Does Fred Smith have a clinically significant allergic reaction to any
antimicrobial agent?

** NO

48) Fred Smith's weight in kilograms (or <number> POUNDS):

** 70


Note  that  2  drugs  cover  for  6  potential  organisms. 


[Rec 1] My preferred therapy recommendation is as follows:

In order to cover for Items <1 2 3 5 6>:

Give: GENTAMICIN
Dose: 119 mg (1.7 mg/kg) q8h IV [or IM] for 10 days
Comments: Modify dose in renal failure

In order to cover for Item <4>:

Give: CLINDAMYCIN
Dose: 595 mg (8.5 mg/kg) q6h IV [or IM] for 14 days
Comments: If diarrhea or other GI symptoms develop, patient
should be evaluated for possible pseudomembranous colitis.

Do  you  wish  to  see  the  next  choice  therapy? 

Alternative  choices  are  available. 

**  YES 

Which,  if  any,  of  the  above  drugs  do  you  wish  eliminated  from  consideration? 
**  CLINDAMYCIN 

[Rec 2] Next best therapy recommendation:

In order to cover for Items <2 3 4 5 6>:

Give: CHLORAMPHENICOL
Dose: 1 gm (15 mg/kg) q6h IV for 14 days
Comments: Monitor patient's white count

In order to cover for Item <1>:

Give: GENTAMICIN
Dose: 119 mg (1.7 mg/kg) q8h IV [or IM] for 10 days
Comments: Modify dose in renal failure


Knowledge  Base  Organization 

The medical knowledge used by MYCIN is encoded as production rules (see article
Representation.B3) of the sort shown in Figure 1 below. The rules are stored internally in the
INTERLISP form shown, from which the English version is generated. Each rule is a single
"chunk" of domain-specific information indicating an action (in this case a conclusion) that is
justified if the conditions specified in the premise are fulfilled. Since the rule uses a
vocabulary of concepts common to the domain, it forms, by itself, a comprehensible
statement of some piece of domain knowledge. As will become clear, this characteristic is
useful in many ways.

Each rule is highly stylized, with the if/then format and the specified set of available
primitives. While the LISP form of each is executable code (the premise, in fact, is simply
EVALuated by LISP to test its truth; and the action, EVALuated to make its conclusions), this
tightly structured form makes it possible to examine the rules as well as execute them. For
example, the rules can be translated into a readable English format as in Figure 1. This
translation capability has been used in MYCIN to explain the program's inferences to the
expert. The ability to explain a line of reasoning leading to a conclusion and to justify why
the program is asking a particular question in a given case is important. Physicians are more
likely to accept the recommendations of a system that can explain its rationale for making
them. This ability is discussed in more detail in Article B on TEIRESIAS.

RULE050

If   1) the infection is primary-bacteremia, and
     2) the site of the culture is one of the sterile sites, and
     3) the suspected portal of entry of the organism is the
        gastrointestinal tract,
then there is suggestive evidence (.7) that the identity of the organism is
     bacteroides.

PREMISE: ($AND (SAME CNTXT INFECT PRIMARY-BACTEREMIA)
               (MEMBF CNTXT SITE STERILESITES)
               (SAME CNTXT PORTAL GI))
ACTION:  (CONCLUDE CNTXT IDENT BACTEROIDES TALLY .7)

Figure 1. MYCIN production rule.

The  current  knowledge  base  contains  450  such  rules  that  enable  MYCIN  to  diagnose  and 
prescribe  therapy  for  bacteremia  (infections  of  the  blood)  and  meningitis. 

Note that the rules are judgmental; that is, they make inexact inferences on a confidence
scale of -1.0 to 1.0, where -1.0 represents complete confidence that a proposition is false
and 1.0 represents complete confidence that it is true. In the case of the above rule, the
evidence cited in the premise is enough to assert the conclusion shown with a mild degree of
confidence: 0.7. This number is called the "certainty factor," or CF, and embodies a model
of confirmation described by Shortliffe (1976). MYCIN uses CFs rather than other, more
standard statistical measures to decide between alternatives during a consultation session.
Standard statistical measures were rejected in favor of CFs because experience with
clinicians had shown that they do not use information in a way comparable to that of
standard statistical methods. The concept of CFs, however, did seem to fit the clinicians'
reasoning patterns: their judgments of how they weighted factors, strong or weak, in
decision making.

The CFs are a measurement of the association between the premise and action clauses
of each rule. When a production rule succeeds because its premise clauses are true in the
current context, the CFs of the component clauses, which indicate how strongly each clause is
believed, are combined, and the resulting CF is used to modify the CF specified in the action
clauses. Thus, if the premise was only weakly believed (low, positive total CF), then any
conclusions that the rule might make would be modified (reduced) to reflect this weak belief
that the patient was in a particular situation. In questions 13 and 15 in the transcript above,
the user shows a lack of complete confidence. In addition, since the conclusion of one rule may
be the premise of another, reasoning from premises with less-than-complete confidence
factors is commonplace.
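
The attenuation just described can be sketched in Python. This is only an illustration, not MYCIN's code: it assumes, following the certainty-factor model of Shortliffe (1976), that the CF of a conjunctive premise is the minimum of its clause CFs and that the conclusion's CF is the rule's CF scaled by that premise CF. The function names are hypothetical.

```python
def premise_cf(clause_cfs):
    """CF of a conjunctive premise: the weakest clause limits belief in the whole."""
    return min(clause_cfs)


def conclusion_cf(rule_cf, clause_cfs):
    """Scale the rule's own CF by the strength of belief in its premise."""
    return rule_cf * premise_cf(clause_cfs)


# The rule of Figure 1 (CF .7) with the transcript's answers: the infection
# and the site are known with certainty (1.0), but the portal of entry was
# given with a CF of only .6 (question 15).
cf = conclusion_cf(0.7, [1.0, 1.0, 0.6])
print(round(cf, 2))  # 0.42
```

Under these assumptions the physician's hedged answer to question 15 lowers the rule's contribution from .7 to about .42.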

The premise of each rule is a Boolean combination of one or more clauses, each of which
is constructed from a predicate function with an associative triple (attribute, object, value) as
its argument. Thus, each premise clause typically has the following four components:

<predicate function> <object> <attribute> <value>



For example, the second clause in RULE050, above, is:

    The site of the culture is one of the sterile sites

or, in INTERLISP:

    (MEMBF      CNTXT    SITE        STERILESITES)
       |          |        |              |
    Predicate   Object   Attribute      Value

MEMBF is a predicate, and the triple says that the site of the current object (an organism, in
this case) is a member of the class of sterile sites. There is a standardized set of some 24
domain-independent predicate functions (e.g., SAME, KNOWN, DEFINITE) and a range of
domain-specific attributes (e.g., IDENTITY, SITE), objects (e.g., ORGANISM, CULTURE), and
associated values (e.g., E.COLI, BLOOD). These form the "vocabulary" of conceptual
primitives available for use when constructing rules.

A rule premise is always a conjunction of clauses, but it may contain arbitrarily complex
conjunctions or disjunctions nested within each clause. (Instead of writing rules whose
premise would be a disjunction of clauses, a separate rule is written for each clause.) The
action part indicates one or more conclusions that can be drawn if the premises are
satisfied, making the rules purely inferential.

Medical facts about the patient are represented as 4-tuples made up of an associative
triple and its current CF (see Figure 3 below). Positive CFs indicate that the evidence
confirms the hypothesis; negative CFs indicate disconfirming evidence.

(IDENT  ORGANISM-2  KLEBSIELLA  .26) 

(IDENT  ORGANISM-2  E.COLI  .73) 

(SENSITIVS  ORGANISM-1  PENICILLIN  -1.0) 

(IMMUNOSUPPRESSED  PATIENT-1  YES  1.0) 

Figure  3.  MYCIN  4-tuple. 

MYCIN's model of inexact reasoning permits the coexistence of several plausible values for a
single attribute, if this is suggested by the evidence. For example, after attempting to
deduce the identity (IDENT) of an organism, MYCIN may have concluded (correctly) that there
is evidence of both E.coli and Klebsiella.
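
As a rough illustration of this fact store, the 4-tuples of Figure 3 can be held in a small Python structure. The names `Fact` and `values_of` are hypothetical, chosen for this sketch; they do not come from MYCIN.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    attribute: str
    obj: str
    value: str
    cf: float  # from -1.0 (definitely false) to 1.0 (definitely true)


# The four facts shown in Figure 3.
facts = [
    Fact("IDENT", "ORGANISM-2", "KLEBSIELLA", 0.26),
    Fact("IDENT", "ORGANISM-2", "E.COLI", 0.73),
    Fact("SENSITIVS", "ORGANISM-1", "PENICILLIN", -1.0),
    Fact("IMMUNOSUPPRESSED", "PATIENT-1", "YES", 1.0),
]


def values_of(attribute, obj):
    """All recorded values of one attribute of one object, most strongly believed first."""
    hits = [f for f in facts if f.attribute == attribute and f.obj == obj]
    return sorted(hits, key=lambda f: f.cf, reverse=True)


# Two plausible identities coexist for ORGANISM-2.
for fact in values_of("IDENT", "ORGANISM-2"):
    print(fact.value, fact.cf)
```

Keeping the value inside the tuple, rather than storing one value per attribute, is what lets several hypotheses about the same attribute sit side by side.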

To summarize, there are two major forms of knowledge representation in use in the
performance program: (a) the attributes, objects, and values, which form a vocabulary of
domain-specific conceptual primitives, and (b) the inference rules expressed in terms of
these primitives.


The  Inference  Engine 

In MYCIN, rules are invoked in a simple backward-chaining fashion that produces an
exhaustive depth-first search of an AND/OR goal tree (see article Search.Problem.Reduction).
For example, assume that the program is attempting to determine the identity of an infecting
organism. It retrieves all the rules that make a conclusion about the topic (i.e., that mention
the identity of bacteria in their action) and invokes each one in turn, evaluating each premise
to see if the conditions specified have been met. For the sample rule above, this process
would begin with determining the type of infection. Since the type of the infection is
unknown, it is set up as a subgoal and the process recurs.

The search is thus depth-first (because each premise condition is thoroughly explored
in turn); the tree that is sprouted is an AND/OR goal tree (because rules may have OR
conditions in their premise); and the search is exhaustive (because the rules are inexact, so
that even if one succeeds, it was deemed a wisely conservative strategy to continue to
collect all evidence about the subgoal).

The subgoal that is set up is a generalized form of the original goal. Thus, for the first
clause in the rule ("the infection is primary-bacteremia"), the subgoal set up is "determine
the type of infection." The subgoal is therefore always of the form "determine the value of
<attribute>" rather than "determine whether the <attribute> is equal to <value>." By setting up the
generalized goal of collecting all evidence about an attribute, the performance program
effectively exhausts each subject as it is encountered, and thus tends to group together all
questions about a given topic. This feature results in a system that displays a much more
focused, methodical approach to the task, which is a distinct advantage where human
engineering considerations are important. The cost is the effort of deducing or collecting
information that is not strictly necessary. However, since this unnecessary effort occurs
rarely (only when the <attribute> can be deduced with certainty to be the <value> named in
the original goal), it has not proven to be a problem in practice.

If  after  trying  all  relevant  rules  to  resolve  a  subgoal,  the  total  weight  of  the  evidence 
about  a  hypothesis  falls  between  -.2  and  .2  (an  empirical  threshold),  the  answer  is  regarded 
as  still  unknown.  This  result  would  occur  if  no  rules  were  applicable,  if  the  applicable  rules 
were  too  weak,  if  the  effects  of  several  rules  offset  each  other,  or  if  there  were  no  rules  for 
this  subgoal  at  all.  In  any  of  these  cases,  when  the  system  is  unable  to  deduce  the  answer, 
it  asks  the  user  for  the  value  of  the  subgoal  (using  a  phrase  that  is  stored  along  with  the 
attribute  itself). 
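
The deduce-then-ask control loop can be sketched in Python under simplifying assumptions. The rule format, the names `RULES`, `find_out`, and `ask`, and the single toy rule are all hypothetical; premise strength is taken as the minimum of the clause CFs and positive evidence is combined as `old + new * (1 - old)`, which follows the certainty-factor model but omits negative evidence.

```python
THRESHOLD = 0.2  # the empirical cutoff mentioned in the text

# Hypothetical toy rule base: a rule concludes (attribute, value) with some CF
# when every (attribute, value) pair in its premise is believed.
RULES = [
    {"premise": [("INFECT", "PRIMARY-BACTEREMIA"), ("PORTAL", "GI")],
     "concludes": ("IDENT", "BACTEROIDES"), "cf": 0.7},
]


def find_out(attribute, known, ask):
    """Collect all evidence about `attribute`; ask the user only if deduction fails."""
    if attribute in known:
        return known[attribute]
    beliefs = {}
    for rule in RULES:
        concl_attr, concl_value = rule["concludes"]
        if concl_attr != attribute:
            continue
        # Each premise attribute becomes a generalized subgoal (depth-first).
        strengths = [find_out(a, known, ask).get(v, 0.0) for a, v in rule["premise"]]
        premise = min(strengths)
        if premise > THRESHOLD:
            old = beliefs.get(concl_value, 0.0)
            new = rule["cf"] * premise
            beliefs[concl_value] = old + new * (1 - old)
    # Evidence that stays inside the -.2 to .2 band counts as unknown.
    if not any(abs(cf) > THRESHOLD for cf in beliefs.values()):
        beliefs = ask(attribute)
    known[attribute] = beliefs
    return beliefs


answers = {"INFECT": {"PRIMARY-BACTEREMIA": 1.0}, "PORTAL": {"GI": 0.6}}
asked = []


def ask(attribute):
    asked.append(attribute)
    return answers.get(attribute, {})


result = find_out("IDENT", {}, ask)
print(asked)   # ['INFECT', 'PORTAL']
print(result)  # BACTEROIDES is believed with a CF of about 0.42
```

In this run no rules conclude INFECT or PORTAL, so both are asked of the user, while IDENT is deduced and never asked.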

This strategy, of always attempting to deduce the value of a subgoal and asking the
user only when deduction fails, ensures a minimum number of questions. It could also mean,
however, that work might be expended searching for a subgoal, arriving perhaps at a
less-than-definite answer when the user might already know the answer with certainty. To
prevent this inefficiency, some of the attributes have been labeled "laboratory data," to
indicate that they represent information available to the program as results of quantitative
tests. In these cases the deduce-then-ask procedure is reversed, and the system will
attempt to deduce the answer only if the user cannot supply it. Given the desire to minimize
both tree search and the number of questions asked, there is no guaranteed optimal solution
to the problem of deciding when to ask for information and when to try to deduce it. But the
distinction described has performed quite well and seems to embody an appropriate criterion.

Two other additions to straightforward tree search increase the inference engine's
efficiency. First, before the entire list of rules for a subgoal is retrieved, the program
attempts to find a sequence of rules that would establish the goal with certainty, based only
on what is currently known. Since this is a search for a sequence of rules with CF = 1, the
result is termed a unity path. Besides efficiency considerations, this process offers the
advantage of allowing the program to make "commonsense" deductions with a minimum of
effort (rules with CF = 1 are largely definitional). Because there are few such rules in the
system, the search is typically very brief.

Second, the inference engine performs a partial evaluation of rule premises. Since
many attributes are found in several rules, the value of one clause (perhaps the last) in a
premise may already have been established while the rest are still unknown. If this clause
alone would make the premise false, there is clearly no reason to do all the search
necessary to establish the others. Each premise is thus "previewed" by evaluating it on the
basis of currently available information. The result is a Boolean combination of TRUEs,
FALSEs, and UNKNOWNs; and straightforward simplification (e.g., F ∧ U = F) indicates whether
the rule is guaranteed to fail.
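
The preview amounts to three-valued logic. A minimal Python sketch follows; the constant and function names are hypothetical, and nested combinations can be handled by feeding the result of one call into another.

```python
TRUE, FALSE, UNKNOWN = "T", "F", "U"


def preview_and(values):
    """Three-valued AND: a single FALSE settles it; otherwise an UNKNOWN leaves it open."""
    if FALSE in values:
        return FALSE
    if UNKNOWN in values:
        return UNKNOWN
    return TRUE


def preview_or(values):
    """Three-valued OR: a single TRUE settles it; otherwise an UNKNOWN leaves it open."""
    if TRUE in values:
        return TRUE
    if UNKNOWN in values:
        return UNKNOWN
    return FALSE


def guaranteed_to_fail(clause_values):
    """A conjunctive premise can be skipped once its preview is already FALSE."""
    return preview_and(clause_values) == FALSE


# The last clause is already known to be false, so the first two need no search.
print(guaranteed_to_fail([UNKNOWN, UNKNOWN, FALSE]))  # True
print(guaranteed_to_fail([TRUE, UNKNOWN, UNKNOWN]))   # False
```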


Therapy  Selection 

After MYCIN determines the significant infections and the organisms that cause them, it
proceeds to recommend an antimicrobial regimen if this is appropriate. The MYCIN therapy
selector (Clancey, 1979) uses a description of the patient's infections, causal organisms, a
ranking of drugs by sensitivity, and a set of drug preference categories (such as "propose 2
drugs: one second choice drug and one third choice drug") to recommend a drug regimen. The
algorithm will also modify dosages in the case of renal failure in the patient. The program can
provide detailed explanations about how it made a regimen choice and can accept and
critique a regimen proposed by the physician.


Acquisition  and  Use  of  New  Knowledge 

The representation of knowledge as production rules and the ability to explain specific
rules allow MYCIN to interact with an expert clinician in a manner that permits the system to
acquire and use new knowledge. The TEIRESIAS system (see article B; also Davis, 1976)
works in conjunction with MYCIN and allows the expert to inspect faulty reasoning chains and
then add and modify any rules or clinical parameters required to augment and repair the
medical knowledge of MYCIN.

When the expert is dissatisfied with the system's performance on a particular case,
MYCIN is able to explain how it made the erroneous conclusions and to guide the expert
while he is determining the source of the reasoning "bug." To correct the reasoning, the
expert may elect to enter new rules or alter existing ones. The user enters his requests
through what is nearly a natural language interface. These requests are parsed and used by
the system to create a new internal rule that is then presented to the user for inspection.
This interaction helps minimize any misunderstanding between the clinician and MYCIN.

Once this new rule is accepted and understood by the system, the next consultation
will make use of it and alter its recommendations accordingly. This ability permits the system
to interact directly with the domain experts without intervention of a programmer.
Concluding Remarks

Formal evaluations of the MYCIN system indicate that MYCIN
compares favorably with infectious disease experts in diagnosing and selecting therapy for
patients with bacteremia and meningitis. At present, however, the system is not used on the
wards, primarily due to its incomplete knowledge of the full spectrum of infectious diseases.

MYCIN is one of the first of a new breed of computer systems: systems that step out
of the toy worlds of AI into the real world. These systems must deal with many of the social
and psychological problems of man/machine interactions. Issues such as modularity and
representation of knowledge, reasoning in specific domains, explanation of a system's logic,
and the ability to accumulate and use new information must be considered with attention to
both programming and interfacing problems. MYCIN has been designed with these issues in
mind and has consequently shown promise as a real-world aid to the clinician.
The Causal Associational NETwork (CASNET) program (Weiss, Kulikowski, & Safir, 1977)
is  a  computer  system  for  performing  medical  diagnosis  developed  at  Rutgers  University.  The 
major  application  of  CASNET  has  been  in  the  domain  of  glaucoma.  The  system  represents  a 
disease  not  as  a  static  "state"  but  as  a  dynamic  process  that  can  be  modeled  as  a  network  of 
causally  linked  pathophysiological  states.  The  system  diagnoses  a  patient  by  determining  the 
pattern  of  pathophysiological  causal  pathways  present  in  the  patient  and  identifying  this 
pattern  with  a  disease  category.  Once  the  disease  category  is  explicitly  identified,  the  most 
appropriate  treatments  can  be  prescribed.  The  causal  model  also  makes  possible  a 
prediction  of  the  likely  future  course  of  a  disease  both  if  treated  and  if  untreated. 
Figure 1. Three-level description of a disease process.
From Weiss, 1978.
Representation of Medical Knowledge

A CASNET model consists of three "planes of knowledge," parts of which are shown in
Figure 1. The plane of pathophysiological states is the heart of the model. The nodes in this
plane represent elementary hypotheses about the disease process, and arcs here represent
a causal connection between two elementary hypotheses; for example, INCREASED
INTRAOCULAR PRESSURE ... CAUSES ... CUPPING OF THE OPTIC DISK. Associated with each
link is a forward weight or confidence factor, a number on a 1-5 scale, where 5 corresponds to
"(almost) always causes" and 1 to "rarely causes." The determination of these weights and
their utility in confirming or disconfirming the presence of a pathophysiological state are
discussed later in this article.

The  plane  of  observations  contains  nodes  representing  evidence  gathered  from  the 
patient.  These  include  signs,  symptoms,  and  laboratory  tests.  During  a  consultation,  some  or 
all  of  these  nodes  will  be  instantiated.  Nodes  in  this  plane  are  linked  to  nodes  in  the 
pathophysiological  plane.  The  links  have  associated  confidences,  again  on  a  1-5  scale, 
reflecting  the  degree  to  which  the  particular  test,  symptom,  or  sign  supports  the  associated 
state. For example, a scotoma (a perimetry measurement) strongly indicates VISUAL FIELD
LOSS, so it has a confidence value of 5. The same test, however, could have a different
confidence  value  depending  on  the  results;  for  example,  15  mm  of  Hg  could  be  considered 
evidence  for  INCREASED  INTRAOCULAR  PRESSURE,  but  a  result  of  30  would  be  definite 
evidence  and  would  carry  a  greater  confidence  value.  The  confidence  values  with  which 
observations  are  linked  to  pathophysiological  states  are  predetermined  by  the  designers  of 
CASNET. 

In  general,  there  is  usually  more  than  one  test  for  a  particular  state,  and  the  same  test 
indicates  more  than  one  state.  Each  test  also  has  an  associated  cost  that  reflects  both 
monetary  cost  and  danger  to  the  patient.  Some  states  might  not  have  a  corresponding  test 
since  such  a  test  might  not  exist  or  might  be  judged  too  difficult  or  costly  to  use  for  a 
particular  pathology. 

The  third  plane  contains  the  disease  classification  tables.  A  classification  table 
defines  a  "disease"  as  a  set  of  confirmed  and  denied  pathophysiological  states.  It  also 
contains  a  set  of  treatment  statements  for  that  disease. 


STATES           DISEASES                          TREATMENTS

ANGLE CLOSURE
INCR IOP         ANGLE-CLOSURE-GLAUCOMA            TREATMNT1...
CUPPING
VFL              CHRONIC-ANGLE-CLOSURE GLAUCOMA    TR1, TR2...

Figure 2. A classification table.

For example, the classification table in Figure 2 indicates that if a patient is found to have
ANGLE CLOSURE and INCREASED INTRAOCULAR PRESSURE but neither CUPPING nor VISUAL
FIELD LOSS, then he has ANGLE CLOSURE GLAUCOMA; if he has ANGLE CLOSURE and
INCREASED INTRAOCULAR PRESSURE and CUPPING and VISUAL FIELD LOSS, then he has
CHRONIC ANGLE CLOSURE GLAUCOMA. The concept represented in the classification tables
is that a disease is dynamic with respect to time and that confirmed states further down a
pathway represent more advanced stages of the disease. The states in a classification table
will generally be on the same pathway. A "starting state" is a state with no causes in the
network (also called a basic disease mechanism). Inadequate understanding of disease
mechanisms or incomplete models sometimes lead to classification tables containing states
from more than one pathway.
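
The reading of a classification table given above can be sketched in Python. The data layout and the names `TABLE` and `classify` are hypothetical; only the two rows described for Figure 2 are encoded, with each disease defined by the states that must be confirmed and the states that must be denied.

```python
# Each row: a disease defined by a set of confirmed and a set of denied states.
TABLE = [
    {"disease": "ANGLE CLOSURE GLAUCOMA",
     "confirmed": {"ANGLE CLOSURE", "INCR IOP"},
     "denied": {"CUPPING", "VFL"}},
    {"disease": "CHRONIC ANGLE CLOSURE GLAUCOMA",
     "confirmed": {"ANGLE CLOSURE", "INCR IOP", "CUPPING", "VFL"},
     "denied": set()},
]


def classify(confirmed, denied):
    """Return every disease whose required confirmed and denied states all hold."""
    return [row["disease"] for row in TABLE
            if row["confirmed"] <= confirmed and row["denied"] <= denied]


# Early stage: pressure is raised, but the optic disk and visual field are intact.
print(classify({"ANGLE CLOSURE", "INCR IOP"}, {"CUPPING", "VFL"}))
# Advanced stage: states further down the pathway are confirmed as well.
print(classify({"ANGLE CLOSURE", "INCR IOP", "CUPPING", "VFL"}, set()))
```

Requiring the denied states explicitly is what keeps the early-stage row from also matching a patient whose disease has progressed.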


Reasoning 

Figure 2 illustrates how CASNET defines a disease as a conjunction of causally related
pathophysiological states. Diagnosis in CASNET is a matter of finding one or more causal
pathways between these states. Reasoning in CASNET is designed to maximize the likelihood
of finding these pathways, given a set of signs, symptoms, and test results.

A diagnostic session begins with the program's asking the user (physician) a series of
questions about the patient. The physician answers with values for any tests, signs, and
symptoms, or he answers UNKNOWN. These values, together with the confidences associated
with the tests and the weights associated with the causal arcs, are used to compute a
status, or confidence factor, for each node in the causal net.

The STATUS of a state is affected both by the results of its associated tests and by
the STATUSes of the states around it. For example, if A causes B and B is confirmed by
observation, then there is strong evidence for A. A general algorithm is used to propagate
these weights on a state, both in the forward direction (i.e., along the direction of the causal
link) and in the backward direction. A state is marked confirmed if its STATUS is greater than
a preset threshold and denied if its STATUS is less than a second threshold;
otherwise it is undetermined. The program uses a strategy for selecting the next question,
based on the cost of the test and on the likelihood that it will lead to the confirmation or
denial of a state.
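
The two-threshold marking can be written down directly. In this Python sketch the function name, the threshold values, and the STATUS numbers are all hypothetical placeholders; the text gives neither the actual thresholds nor the scale on which STATUS is computed.

```python
def mark_state(status, confirm_threshold, deny_threshold):
    """Label a state from its STATUS and two preset thresholds."""
    if status > confirm_threshold:
        return "confirmed"
    if status < deny_threshold:
        return "denied"
    return "undetermined"


# Hypothetical thresholds and STATUS values, for illustration only.
CONFIRM, DENY = 0.5, -0.5
statuses = {"ANGLE CLOSURE": 0.9, "INCR IOP": 0.6, "CUPPING": -0.8, "VFL": 0.1}
marks = {state: mark_state(s, CONFIRM, DENY) for state, s in statuses.items()}
print(marks)
```

A state such as VFL in this example stays undetermined, which makes it a candidate for further questioning.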

After all available symptoms and findings have been entered and after the STATUSes
have been computed, the classification tables are used to determine diagnoses and
treatments. The tables are selected to cover all confirmed nodes. The strategy for
selecting the tables is to find the starting states for which causal pathways can be
generated that reach the largest number of confirmed states without traversing a denied
state. This procedure is repeated until all of the confirmed states are covered.
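
One plausible reading of this selection strategy is a greedy cover over the causal network, sketched below in Python. The text does not spell out the algorithm, so the greedy choice, the function names, and the small network are assumptions made for illustration.

```python
def reachable(start, causes, denied):
    """States reachable from `start` along causal links without passing a denied state."""
    if start in denied:
        return set()
    seen, stack = {start}, [start]
    while stack:
        for nxt in causes.get(stack.pop(), []):
            if nxt not in seen and nxt not in denied:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def cover(starting_states, causes, confirmed, denied):
    """Repeatedly pick the starting state that reaches the most uncovered confirmed states."""
    uncovered, chosen = set(confirmed), []
    while uncovered:
        best = max(starting_states,
                   key=lambda s: len(reachable(s, causes, denied) & uncovered))
        gained = reachable(best, causes, denied) & uncovered
        if not gained:
            break  # the remaining confirmed states lie on no modeled pathway
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical fragment of a causal network: each state lists the states it causes.
causes = {"ANGLE CLOSURE": ["INCR IOP"], "INCR IOP": ["CUPPING"], "CUPPING": ["VFL"]}
print(cover(["ANGLE CLOSURE"], causes,
            confirmed={"ANGLE CLOSURE", "INCR IOP"}, denied={"CUPPING", "VFL"}))
```

The classification tables associated with the chosen starting states would then be consulted for diagnoses and treatments.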

The treatment statements of the selected classification tables are then used to select
a therapy for the indicated diseases. Like a state, a treatment has an associated STATUS
that is interpreted as the confidence in its success as a treatment. The treatment with the
highest STATUS is selected. This assessment is repeated for all selected classification
tables. A final algorithm decides whether some treatments are subsumed by others, and then
the final treatment recommendations are printed. If desired, a short summary of research
justifying the diagnosis and treatment can also be printed. The current glaucoma model
contains about 160 states, 360 tests, and 60 classification tables.

Concluding Remarks

CASNET adopts a strictly "bottom-up" approach to the problem of diagnosis, working
from the tests, through the causal pathways, to a diagnosis. The separation of medical
knowledge (encoded in the causal network) from reasoning strategies (embodied in the
program) will make the expansion of the disease model, when new research discoveries are
made, a simple matter. The program is continually being tested and updated by a
computer-based network of collaborators. The model also provides a convenient way of
following the progress of a patient's disease over multiple visits: the causal net can be
used to view the disease progression, both forwards and backwards, along the pathways.
Although CASNET has been used primarily in the area of glaucoma, the representational
scheme and decision-making procedures are applicable to other disease areas that are
understood well enough to make the process of disease known. The program's performance
has been evaluated by ophthalmologists and is considered close to expert level.

C3. INTERNIST

INTERNIST (Pople, 1975; Pople, 1977) is a medical consultation program in the domain
of internal medicine developed jointly by H. Pople, a computer scientist, and J. Myers, a
specialist in internal medicine, both at the University of Pittsburgh. The program is presented
with a list of manifestations of disease in a patient (e.g., symptoms, physical signs, laboratory
data, and history), and it attempts to form a diagnosis. The diagnosis consists of a list of
diseases that would account for the manifestations. Using information presented during the
course of the consultation, the program is able to discriminate between competing disease
hypotheses. The current version of the program only formulates diagnoses and does not
recommend treatments.

One  of  the  major  goals  of  the  INTERNIST  project  has  been  to  model  the  way  clinicians 
do  diagnostic  reasoning.  The  program  has  been  used  to  explore  the  way  that  certain 
symptoms  evoke  particular  diseases  in  the  mind  of  the  clinician:  how  hypothesized  diseases 
generate  expectations  of  other  symptoms,  how  a  clinician  focuses  on  a  particular  disease 
area  and  temporarily  ignores  certain  other  symptoms  that  he  judges  irrelevant,  and  how  he 
decides  between  competing  disease  hypotheses. 

From the standpoint of computer science, INTERNIST is solving a theory formation or
hypothesis formation problem. Determining a satisfactory diagnosis involves inferring a set of
hypotheses to explain the patient data. In INTERNIST, the data are manifestations and the
hypotheses are diseases.

Diagnosis in internal medicine is complicated because a patient may suffer from a
number of diseases simultaneously. Although some diseases are more likely to be associated
than others, the possible combinations are too numerous to encode a priori. Pople (1977)
suggests that a conservative estimate of this number is 10 to the 40th. Clearly, diagnosis of
a set of diseases present in a patient is nontrivial. INTERNIST-I accomplishes this diagnosis
by sequentially establishing the diseases that best fit the data. INTERNIST-II is an
improvement over its predecessor because it establishes the set of diseases in parallel and
therefore avoids some of the annoying artifacts of sequential processing, such as
considering a number of incorrect diagnoses before "focusing in" on the correct one.


Overview of INTERNIST-I

For INTERNIST-I a problem is defined as a set of mutually exclusive disease hypotheses.
If a patient has a number of diseases, INTERNIST-I must solve that number of problems. In
brief, INTERNIST-I finds a set of diseases that account for some or all of a set of symptoms,
then picks one disease from the set on the basis of a scoring scheme; that disease is the
solution to one of the problems. Then it finds another set of diseases that account for some
or all of any remaining symptoms and again picks the most likely of these alternatives. It
continues in this manner until all symptoms have been accounted for.
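
The sequential loop just described can be sketched as follows. This is a simplified illustration, not the INTERNIST-I implementation: the function names, the abstract disease and manifestation labels, and the coverage-count scoring function are all assumptions made for the example.

```python
def sequential_diagnosis(symptoms, explains, score):
    """Repeatedly pick the best-scoring disease for the still-unexplained symptoms."""
    remaining = set(symptoms)
    diagnosis = []
    while remaining:
        candidates = [
            d for d in explains
            if d not in diagnosis and explains[d] & remaining
        ]
        if not candidates:
            break  # nothing in the knowledge base explains what is left
        best = max(candidates, key=lambda d: score(d, remaining))
        diagnosis.append(best)
        remaining -= explains[best]
    return diagnosis, remaining


# Toy knowledge base: which manifestations each (abstract) disease accounts for.
explains = {"D1": {"m1", "m2"}, "D2": {"m3", "m4"}, "D3": {"m4"}}


def coverage(disease, remaining):
    """Hypothetical scoring function: count of remaining symptoms explained."""
    return len(explains[disease] & remaining)


diagnosis, unexplained = sequential_diagnosis(
    {"m1", "m2", "m3", "m4", "m5"}, explains, coverage
)
print(diagnosis, unexplained)
```

Each pass through the loop corresponds to one "problem" in the sense used above; the loop stops early if some symptom cannot be explained by any disease.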


Representation  of  Medical  Knowledge 

INTERNIST's knowledge of diseases is organized into a disease tree, or taxonomy, using
the "form-of" relation (see Fig. 1). For example, hepatocellular disease is a form of liver
disease.

Figure 1. The structure of the disease tree.


The top-level classification in this tree is by organs: heart disease, lung disease, liver
disease, etc. A disease node's offspring are refinements of that disease, terminal nodes being
individual diseases. A nonterminal node and its subtree are referred to as a disease area,
while a terminal node is referred to as a disease entity. The disease hierarchy is
predetermined and fixed in the system.


Diseases and their manifestations are related in two major ways: (a) a manifestation
can evoke a disease and (b) a disease can manifest certain signs and symptoms. These
relations can loosely be thought of as probabilities: p(D|M) (the conditional probability of D
given M) and p(M|D), respectively. The strength of these relations is given by a number on
a 0-6 scale, where 6 means that the manifestation is always associated with the disease
and 0 means that no conclusions can be drawn about the disease and the manifestation.
Each disease in the tree is associated with its relevant manifestations. Several other kinds
of relationships are superimposed on the disease tree to capture causal, temporal, and other
association patterns among diseases.


The disease tree and its associated manifestations are constructed and maintained
separately from the normal diagnosis program. All known evokes and manifest relations are
entered for the terminal nodes (diseases) of the tree. A list of manifestations is then
computed for each nonterminal of the tree by taking the intersection of the manifestation
lists of that node's offspring. In this way, the manifestations "percolate" up through the tree
to the most general disease with which they are associated and are stored only with this
node. This means that manifestations associated with a nonterminal disease node are, by
implication, also associated with every node (terminal or nonterminal) beneath it in the tree.
As well as providing storage economy, this information is used during the consultation for
selecting disease areas on which to focus. For example, jaundice (yellowing of the skin) will
be associated with some nonterminal disease (e.g., hepatitis) under liver diseases, and its
presence in a patient will cause the consultation program to investigate diseases in that
disease area.
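
The "percolation" step can be sketched as a bottom-up pass over the tree. This is a minimal sketch under assumptions: the dictionary encoding of the tree, the function name, and the toy entities d1 to d3 with manifestations m1 to m4 are hypothetical, and the strengths of the evokes and manifest relations are ignored.

```python
def percolate(children, leaf_manifestations, root):
    """Return {node: manifestations stored at that node} after percolation."""
    stored = {}

    def visit(node):
        kids = children.get(node, [])
        if not kids:
            stored[node] = set(leaf_manifestations[node])
            return
        for kid in kids:
            visit(kid)
        # Manifestations shared by every offspring move up to this node ...
        common = set.intersection(*(stored[kid] for kid in kids))
        # ... and are stored only here, not with the offspring.
        for kid in kids:
            stored[kid] -= common
        stored[node] = common

    visit(root)
    return stored


# Toy tree: a disease area with one sub-area and three terminal entities.
children = {
    "liver disease": ["hepatocellular disease", "d3"],
    "hepatocellular disease": ["d1", "d2"],
}
leaf_manifestations = {
    "d1": {"jaundice", "m1", "m2"},
    "d2": {"jaundice", "m1", "m3"},
    "d3": {"jaundice", "m4"},
}
stored = percolate(children, leaf_manifestations, "liver disease")
print(stored)
```

In the toy data jaundice ends up stored only at the most general node, which is what lets a single manifestation direct attention to a whole disease area.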

Various properties are associated with each manifestation. The most important ones are
TYPE and IMPORT. TYPE is a measure of how expensive it is to test for a manifestation, both
in terms of financial cost and physical risk to the patient. TYPE is used to order the questions
asked by the consultation program: questions about less expensive manifestations are
asked first. The IMPORT of a manifestation is a measure of how easily it can be ignored in a
diagnosis. The manifestation "Shellfish ingestion" can easily be ignored, but a liver biopsy
showing caseating granulomas must be explained.


Reasoning 

At  the  beginning  of  a  consultation,  a  list  of  manifestations  is  entered.  As  each 
manifestation  is  entered,  it  evokes  one  or  more  nodes  of  the  disease  tree.  A  model  is 
created for each evoked disease node. The model consists of four lists:

-- Observed manifestations that this disease cannot explain. (This list is called
the shelf.)

-- Observed manifestations that are consistent with the disease.

-- Manifestations that should be present if this disease is the correct diagnosis
but that have not been observed in the patient.

-- Manifestations consistent with this disease but that have not yet been
observed in the patient.

After the initial entry of manifestations, the disease tree consists of nodes that have been
"lit up" (evoked) and those that have not. A diagnosis corresponds to a set of lit terminal
nodes that account for all of the symptoms. In general, at this stage very few of the
terminal nodes will be lit up, so the program must ask for further information. To get this
further information, the program will focus on a disease area and formulate a problem.

Each disease model is scored, receiving a positive score for each manifestation it
explains and a negative score for each manifestation that it cannot explain. Both are
weighted by IMPORT. It receives a bonus if it is linked causally to a disease that has
already been confirmed. The disease models are partitioned into two sets: (a) the
top-ranked model and the diseases that are mutually exclusive to it (alternatives), and (b) the
diseases that are complementary to the top-ranked model. For example, if the top-ranked
node is hepatocellular injury, then other evoked liver diseases will be alternatives to it, while
lung or heart diseases will be complementary.
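
The scoring and partitioning step can be sketched as below. The sketch rests on several assumptions that are not in the source: IMPORT weights are simply summed, the causal bonus is a plain additive constant, and "mutually exclusive" is approximated by "belongs to the same disease area", following the liver example above. All names and numbers are hypothetical.

```python
def score_model(explained, unexplained, importance, causal_bonus=0):
    """IMPORT-weighted credit for explained findings minus penalty for unexplained ones."""
    gained = sum(importance[m] for m in explained)
    lost = sum(importance[m] for m in unexplained)
    return gained - lost + causal_bonus


def formulate_problem(scores, area_of):
    """Split evoked models into the top model, its alternatives, and complements."""
    top = max(scores, key=scores.get)
    alternatives = [d for d in scores if d != top and area_of[d] == area_of[top]]
    complementary = [d for d in scores if area_of[d] != area_of[top]]
    return top, alternatives, complementary


importance = {"m1": 3, "m2": 1, "m3": 2}  # hypothetical IMPORT values
scores = {
    "liver disease A": score_model({"m1", "m2"}, {"m3"}, importance),
    "liver disease B": score_model({"m1"}, {"m2", "m3"}, importance),
    "lung disease C": score_model({"m3"}, {"m1", "m2"}, importance, causal_bonus=1),
}
area_of = {"liver disease A": "liver", "liver disease B": "liver", "lung disease C": "lung"}
top, alternatives, complementary = formulate_problem(scores, area_of)
print(top, alternatives, complementary)
```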

Having formulated a problem by partitioning the disease models, the system follows one
of several strategies, depending on the number of candidate diseases in the problem set. If
there are many (more than 4) alternative hypotheses, it attempts to rule out as many as
possible. Questions about manifestations that are strongly expected when a disease is
present (high p(M|D)) are selected first. If these manifestations are not present, then this
disease can be ruled out. If there are between 2 and 4 possibilities, the program attempts
to discriminate between them. Then questions about manifestations that are strongly
expected under one disease, D1 (high p(M|D1)), and weakly expected under another
disease, D2 (low p(M|D2)), are selected. These questions are able to discriminate between
the two diseases. If there is only one candidate, then questions that have a good chance of
confirming this disease are asked. Sometimes, if there is not enough data, it will not be
possible to confirm one of the terminal nodes, and a more general diagnosis is given
(e.g., "liver disease").
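
The choice among the three questioning strategies, and the selection of a discriminating question, can be sketched as follows. The function names and the toy strengths are assumptions; in particular, ranking questions by the difference between the two strengths is only one plausible reading of "high p(M|D1) and low p(M|D2)".

```python
def choose_strategy(n_candidates: int) -> str:
    """Pick a questioning strategy from the size of the problem set."""
    if n_candidates < 1:
        raise ValueError("a problem needs at least one candidate disease")
    if n_candidates > 4:
        return "rule out"
    if n_candidates >= 2:
        return "discriminate"
    return "confirm"


def discriminating_question(d1, d2, strength):
    """Manifestation that most favors d1 over d2; strength[d][m] stands in for p(M|D)."""
    candidates = sorted(strength[d1].keys() | strength[d2].keys())
    return max(
        candidates,
        key=lambda m: strength[d1].get(m, 0) - strength[d2].get(m, 0),
    )


# Hypothetical manifest strengths for two competing diseases.
strength = {
    "D1": {"m1": 5, "m2": 3},
    "D2": {"m1": 1, "m2": 3, "m3": 4},
}
print(choose_strategy(6), choose_strategy(3), choose_strategy(1))
print(discriminating_question("D1", "D2", strength))
```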

After a disease is confirmed, its manifestations are marked "accounted for"; bonus
scores are given to (previously manifested) diseases that are causally linked to this one;
and focus shifts to the new top-ranked disease and the formulation of a new problem.


INTERNIST-II 

There was a major problem with INTERNIST-I. In complex cases the program had a
tendency  to  begin  the  analysis  by  focusing  first  on  totally  inappropriate  areas.  While  the 
final  diagnosis  was  usually  correct,  the  initial  meandering  was  annoying  to  clinicians.  The 
cause  of  the  problem  was  traced  to  the  sequential  method  of  problem  formulation.  The 
simultaneous  formulation  of  several  problems  is  being  investigated  in  INTERNIST-II. 

Representation of Medical Knowledge. INTERNIST-II uses the same database as
INTERNIST-I, but it is augmented by a set of constrictor relations. These are manifestations that
do  not  evoke  a  particular  disease  but,  rather,  a  general  area  of  infirmity.  For  example, 
jaundice  alerts  clinicians  to  the  presence  of  liver  disease.  It  does  not  discriminate  between 
liver  diseases,  but  it  does  delimit  this  disease  area.  Formally,  a  disease  area  constrained  by 
a  constrictor  manifestation  is  a  subtree  of  the  disease  tree,  in  this  case  the  subtree  of  liver 
diseases. 

Reasoning. A problem for INTERNIST-I is to find a set of terminal nodes on the disease
tree that accounts for a set of manifestations. It then chooses one node from the set and
formulates another problem. INTERNIST-II does not start a diagnosis by formulating a set of
terminal nodes, because the number of combinations of terminal disease nodes that may
account for a set of manifestations is enormous. Instead, INTERNIST-II partitions the disease
tree into disease areas, which collectively account for all the manifestations. Constrictor
manifestations are used to make the partitions. If a patient manifests more than one
constrictor, then the disease tree will be partitioned into more than one disease area. The
conjunction of all the disease areas is called the root structure and is formally a set of
subtrees of the disease tree. A root structure accounts for all the patient's manifestations.
The problem for INTERNIST-II is to decide which terminal nodes (actual diseases) within the
root structure best account for the manifestations. This objective is accomplished by
partitioning the root structure into smaller subtrees in exactly the same way that the
disease tree was partitioned into the root structure, namely, by using manifestations that
strongly suggest a disease area (only this time, the disease area is smaller). The process of
partitioning the root structure into smaller areas continues just as long as all the
manifestations are accounted for.
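
The first partitioning step can be sketched as below. This is a rough illustration only; the text itself notes that the real procedure is more complicated. The mapping from constrictor manifestations to disease areas, the function names, and the entities d1 to d4 are hypothetical, and the later refinement of the root structure into smaller subtrees is not modeled.

```python
def terminals_under(children, node):
    """Terminal nodes (disease entities) in the subtree rooted at node."""
    kids = children.get(node, [])
    if not kids:
        return [node]
    found = []
    for kid in kids:
        found.extend(terminals_under(children, kid))
    return found


def root_structure(observed, constrictor_area):
    """Disease areas (subtree roots) delimited by the observed constrictors."""
    return sorted({constrictor_area[m] for m in observed if m in constrictor_area})


children = {
    "liver disease": ["hepatocellular disease", "d3"],
    "hepatocellular disease": ["d1", "d2"],
    "lung disease": ["d4"],
}
# Jaundice as a constrictor for liver disease follows the text; "c2" is invented.
constrictor_area = {"jaundice": "liver disease", "c2": "lung disease"}

roots = root_structure({"jaundice", "m9"}, constrictor_area)
candidates = {area: terminals_under(children, area) for area in roots}
print(roots, candidates)
```

Only the terminal nodes inside the selected areas remain as candidates, which is how the constrictors keep the number of combinations manageable.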

This article is a summary account of the operation of INTERNIST-II. In actuality it is
more complicated. See Pople (1977) for a complete explication. The main point of
INTERNIST-II, however, is that it diagnoses a patient's diseases by dividing the disease tree
into smaller and smaller subtrees, until such time as it achieves a set of terminal nodes that
accounts for all the manifestations.

Concluding Remarks

INTERNIST-I and INTERNIST-II have successfully combined a bottom-up and top-down
approach to medical diagnosis. The patient data evoke certain disease hypotheses
(bottom-up) that are then used to predict (top-down) other manifestations that should be
present if the hypothesis is to be confirmed. The system is purely associational. It does not
attempt to model any disease processes but considers a disease as a static category and
diagnosis as the task of assigning a patient to one or more categories. INTERNIST-I has a
large database, currently containing over 600 of the diseases of internal medicine (about
75% complete). It has displayed expert performance in complex cases involving multiple
diseases. Pople and Myers expect that the system will be in clinical use in the next few years.

C4. Present Illness Program

The Present Illness Program (PIP) is being developed at MIT (Pauker et al., 1976;
Szolovits & Pauker, 1976; Szolovits & Pauker, 1978). It has been used for taking present
illnesses of patients with edemas (accumulation of excess fluids in the body) and patients
with renal (kidney) disease. Taking a present illness is different from performing a complete
diagnosis. It is the typical consultation that a patient has with a general practitioner; the
patient usually presents a chief complaint that becomes the initial focus of the consultation,
and only very low-cost sources of information (such as patient history, physical examination,
and routine lab tests) are used to make a diagnosis. High-cost or high-risk procedures that
may be necessary for a complete diagnosis are not used.

The medical knowledge in PIP is represented as a network of frames (see article
Representation.B7). The frames are centered around diseases, clinical states, and
physiological states (hereafter called the "patient situation") and contain data such as
typical findings, relationships to other patient situations, and rules for judging how well a set
of findings exhibited by a patient "match" the situation described by the frame. Matching is
the key strategy in the diagnosis. Diagnosis involves matching findings to disease frames and
then selecting a set of frames that cover all of the findings. There are, at present, 36
frames for dealing with renal disease.

Currently, the program does not make treatment recommendations. Originally the
system was written in CONNIVER (Article AI Languages.03), but this version was too slow and
it has been recoded to run in MACLISP.


Representation  of  Medical  Knowledge 

The general medical knowledge in PIP is knowledge about diseases (the patient
situation), findings (results of the physical examination and reported symptoms), and the
relationships between these entities. This medical knowledge is organized as a frame system.
Part of a typical frame is shown in Figure 1.

The slots in the frame are grouped into categories as shown. The typical findings are
those that are expected in a patient having this disorder. However, patients with the
disorder need not exhibit all of the typical findings. It is the job of the matching algorithm to
compute a "goodness of fit" of findings to a frame. Some of the typical findings have the
special status TRIGGER. TRIGGERS are key elements of the clinical decision-making strategy.
A TRIGGER is a finding that is sufficiently strongly related to a disorder that presence of the
finding in the patient makes the PIP system attend to the disorder frame as an active
hypothesis. For example, FACIAL EDEMA is listed in Figure 1 as a TRIGGER for ACUTE
GLOMERULONEPHRITIS, meaning that PIP will consider this disease as an active hypothesis if
a patient displays facial edema.

ACUTE-GLOMERULONEPHRITIS
   Typical Findings
      TRIGGERS (EDEMA with LOCATION-FACIAL ...)
      FINDINGS (ANOREXIA ...)

   Logical Decision Criteria
      IS-SUFFICIENT (None)
      MUST-HAVE (None)
      MUST-NOT-HAVE (None)

   Complementary Relations to Other Frames
      CAUSED-BY (STREPTOCOCCAL-INFECTION, ...)
      CAUSE-OF (SODIUM-RETENTION, ...)
      COMPLICATED-BY (ACUTE-RENAL-FAILURE, ...)
      COMPLICATION-OF (CELLULITIS)

   Differential Diagnosis
      CHRONIC-HYPERTENSION implies CHRONIC-GLOMERULONEPHRITIS
      RECURRING-EDEMA implies NEPHROTIC-SYNDROME

   Scoring
      (((PATIENT WITH AGE-CHILD) -> 0.8)
       ((PATIENT WITH AGE-MIDDLE-AGED) -> -.6)
       )
      (((EDEMA with SEVERITY = not MASSIVE) -> 0.1)
       ((EDEMA with SEVERITY = MASSIVE) -> -1.0)
       )

Figure 1. Part of the frame for acute glomerulonephritis (an inflammatory kidney disease).

The logical decision criteria are rules that permit the confirmation or rejection of a
hypothesis on the basis of a small number of key findings. Findings strongly correlated with a
disease will be listed in the slot IS-SUFFICIENT. If any of these findings are reported, they
will be sufficient to confirm the presence of the disease.

The relations between frames reflect the ways in which disorders are related in
medicine. Sometimes disease mechanisms are well understood and it is possible to say that
one disorder CAUSES another or is a COMPLICATION-OF another. If mechanisms are poorly
understood, the disorders may simply be ASSOCIATED. Frames related in these ways are
complementary, that is, they represent disorders that the patient might have in addition to
the initial disorder. In contrast, the differential diagnosis slots indicate mutually exclusive
disorders: the patient may have one of them and not the disorder represented by the
current frame.

The final slot indicates how the findings are scored for the disorder represented by the
frame. This score indicates the "goodness of fit" of this disorder to the findings. The
statements comprising this slot are sets of clauses that are evaluated in turn. Within a
clause, evaluation terminates when one of the conditions in the clause is true; its score will
be used. The local score for a frame is the sum of the values of the clauses, normalized by
the maximum total score possible. Thus, 1 denotes complete agreement, while arbitrarily
large negative numbers denote complete disagreement.
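
The clause evaluation just described can be sketched with the two scoring clauses shown in Figure 1. This is a sketch under assumptions: the encoding of findings as a dictionary, the function names, the value 0 for a clause in which no condition holds, and the reading of "maximum total score possible" as the sum of each clause's largest value are all choices made for the illustration.

```python
def clause_value(clause, findings):
    """Value of the first condition in the clause that holds (0.0 if none holds)."""
    for condition, value in clause:
        if condition(findings):
            return value
    return 0.0


def local_score(clauses, findings):
    """Sum of clause values, normalized by the best total the clauses allow."""
    total = sum(clause_value(clause, findings) for clause in clauses)
    best = sum(max(value for _, value in clause) for clause in clauses)
    return total / best  # assumes at least one positive value, so best > 0


# The two clauses from the frame for acute glomerulonephritis in Figure 1.
age_clause = [
    (lambda f: f.get("age") == "child", 0.8),
    (lambda f: f.get("age") == "middle-aged", -0.6),
]
edema_clause = [
    (lambda f: f.get("edema_severity") not in (None, "massive"), 0.1),
    (lambda f: f.get("edema_severity") == "massive", -1.0),
]
clauses = [age_clause, edema_clause]

good_fit = local_score(clauses, {"age": "child", "edema_severity": "mild"})
poor_fit = local_score(clauses, {"age": "middle-aged", "edema_severity": "massive"})
print(good_fit, poor_fit)
```

With this normalization a child with non-massive edema scores 1 (complete agreement), while a middle-aged patient with massive edema receives a negative score.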


Reasoning 

The clinical strategy used by PIP is based on the manipulations of hypotheses and
findings. Knowledge about findings is stored separately from the frame system since a
finding can be applicable to many frames. A hypothesis is an instantiation of a disorder
frame. There are three kinds of hypotheses: (a) confirmed, (b) active, and (c) semi-active.
Hypotheses with ratings (as computed by the scoring process) that are higher than a preset
threshold are considered confirmed hypotheses. Active hypotheses are those with at least one
confirmed trigger finding, and these contend for the focus of attention. Semi-active hypotheses
are the immediate neighbors of the active hypotheses in the frame system. They correspond
to hypotheses that, although not strong enough to be investigated, are "at the back of the
consultant's mind."

The consultation begins with the physician telling the system the main symptoms and
signs of a patient. The program then takes the initiative and tries to determine the validity of
any active hypotheses by selecting and asking appropriate questions.

The  program  works  through  the  following  cycle: 

1.  Acquire  a  new  finding.  This  task  is  accomplished  by  asking  a  sequence  of 
questions  that  characterizes  the  finding  according  to  its  possible  descriptions. 

2.  Process  the  finding.  All  of  the  frames  where  this  finding  is  relevant  are 
located. 

3. Update the list of active hypotheses. Several kinds of actions can be taken
at this point: Remove an active hypothesis if the finding matches a
MUST-NOT-HAVE rule; confirm a hypothesis if the premise of an IS-SUFFICIENT rule is
now true; activate a hypothesis if the new finding is one of the hypothesis's
triggers or if the finding allows the premise of a differential diagnosis link to
succeed; or revise the score of the hypothesis if the finding matches a
scoring rule. If a new hypothesis is activated, then all of its immediate
relatives are made into semi-active hypotheses.

4. Select the next finding to query. The highest rated hypothesis becomes the
focus of attention, and a question is generated for the next unexplored
finding. If there are no hypotheses, a question about a finding for the highest
rated causally related frame is asked. Questioning terminates when there are
no more active hypotheses or causal relatives with findings to be determined.
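
Step 3 of the cycle can be sketched as a single update function. This is a partial sketch under assumptions: the `Frame` class and its field names are hypothetical, each rule is reduced to a set of finding names, and differential diagnosis links and score revision are omitted. The example frame uses the trigger and causal relatives listed in Figure 1.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str
    triggers: frozenset = frozenset()
    is_sufficient: frozenset = frozenset()
    must_not_have: frozenset = frozenset()
    relatives: frozenset = frozenset()


def update_hypotheses(finding, frames, active, confirmed, semi_active):
    """Apply one new finding to the hypothesis lists (modified in place)."""
    for frame in frames:
        if finding in frame.must_not_have:
            active.discard(frame.name)      # categorical rejection
        elif finding in frame.is_sufficient:
            confirmed.add(frame.name)       # categorical confirmation
            active.discard(frame.name)
        elif finding in frame.triggers and frame.name not in active | confirmed:
            active.add(frame.name)          # trigger: attend to this frame
            semi_active.discard(frame.name)
            # Immediate relatives move to "the back of the consultant's mind".
            semi_active.update(frame.relatives - active - confirmed)


agn = Frame(
    name="ACUTE-GLOMERULONEPHRITIS",
    triggers=frozenset({"FACIAL-EDEMA"}),
    relatives=frozenset({"STREPTOCOCCAL-INFECTION", "SODIUM-RETENTION"}),
)
active, confirmed, semi_active = set(), set(), set()
update_hypotheses("FACIAL-EDEMA", [agn], active, confirmed, semi_active)
print(active, semi_active)
```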

If the logical decision criteria are insufficient to confirm or deny a hypothesis, the
score of the hypothesis is computed by combining (a) the value of a function that measures
the fit of observed findings and typical (expected) findings for the frame (called the
matching score), and (b) the value of a function that is the ratio of the number of findings
accounted for by the hypothesis to the total number of findings (the binding score). The
matching score in turn consists of two parts, a local score for the frame (described above)
and a score propagated from causally related frames.
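
The binding score is the only part of this computation that the text defines exactly, so it is the only part sketched here. The function name and the finding labels are hypothetical, and how PIP combines the binding score with the matching score is not specified in this article and is not modeled.

```python
def binding_score(accounted_for, all_findings):
    """Fraction of the reported findings that the hypothesis accounts for."""
    all_findings = set(all_findings)
    if not all_findings:
        raise ValueError("no findings have been reported")
    return len(set(accounted_for) & all_findings) / len(all_findings)


# A hypothesis that accounts for two of four reported findings.
score = binding_score({"f1", "f2"}, {"f1", "f2", "f3", "f4"})
print(score)
```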


Concluding  Remarks 

Like INTERNIST (see article C3) and unlike MYCIN (article C1), PIP is intended to
simulate the clinical reasoning of physicians. The way in which the general medical
knowledge  has  been  represented  as  a  system  of  hypothesized  disorder  frames  and  clinical 
findings  reflects  this  intent,  as  do  the  strategies  used  to  select  questions  for  confirming  a 
hypothesis. 

The  system  uses  two  types  of  reasoning,  categorical  and  probabilistic.  Decisions  about 
the applicability of a hypothesis are determined using logical decision criteria (the
IS-SUFFICIENT, MUST-HAVE, and MUST-NOT-HAVE rules) that a physician uses. When these are
insufficient,  the  probabilistic  methods  (the  computation  of  matching  scores  and  binding 
scores)  are  used.  Both  kinds  of  reasoning  feature  a  combination  of  local  and  global  decision 
strategies.  Local  strategies  decide  how  well  the  findings  fit  a  particular  frame,  while  global 
strategies  determine  how  well  a  set  of  frames  fits  the  findings. 

There  are  a  number  of  difficulties  with  the  program.  The  questioning  can  be  erratic, 
since  the  top-ranked  hypotheses  tend  to  alternate  rapidly.  This  oscillation  is  unlike  a 
physician's  line  of  reasoning,  which  tends  to  concentrate  on  questions  that  resolve  one 
hypothesis  at  a  time.  There  is  also  the  problem  of  when  to  stop  the  questioning.  The  current 
approach  is  to  stop  questioning  only  when  all  questions  about  all  possibly  relevant 
hypotheses have been exhausted. This strategy seems too conservative; many irrelevant
questions tend to get asked.

C5. Digitalis Advisors

There  has  been  considerable  work  at  MIT  to  develop  programs  that  advise  physicians 
on  the  administration  of  the  drug  digitalis  (Silverman,  1975;  Swartout,  1977a;  and  Swartout, 
1977b).  These  programs  are  not  concerned  with  diagnosing  the  need  for  the  drug  in  a 
patient;  rather,  they  determine  an  appropriate  treatment  regimen  and  its  subsequent 
management  for  patients  known  to  require  digitalis. 

Digitalis is administered to patients with erratic heartbeat to stabilize the heart rhythm.
The therapeutic effect of digitalis is achieved by maintaining the proper amount of the drug
in the bloodstream. The body, however, excretes the drug through the kidneys and liver.
Furthermore, overdoses of digitalis are toxic and can cause the very symptoms that the drug
is prescribed to cure. A typical digitalis regimen consists of an initial dose that is then
modified in response to the effects of the drug on the patient and to the amount of drug
being passed by the kidneys.

A mathematical model of the effects of digitalis in the body has existed since 1967. It
accounts for the relation between the level of body drug stores (as affected by body weight,
renal function, etc.) and the incidence of digitalis toxicity. However, application of this model
requires that a physician adjust the dosages of digitalis recommended by the model to allow
for special sensitivity that a patient might have (or might develop) to the drug. A skilled
physician is still required to monitor a patient's progress after the initial dose of digitalis is
recommended by the mathematical model.

More recently, Pauker, Szolovits, and their colleagues at MIT have developed a program
that makes a model of the effect of digitalis in a specific patient and modifies the model in
response to feedback about the patient over time (Silverman, 1975). Previously, serum
(blood) levels of digitalis had been used to provide feedback, but they proved unsatisfactory
alone, and now clinical signs (e.g., nausea or increased heartbeat irregularity) are used to
assess whether the patient is responding well to digitalis, suffering from the toxic effects of
the drug, or having no reaction at all.

The combination of a general, mathematical model of the effects of digitalis and a
patient-specific model that incorporates clinical data in a continuous feedback cycle has
resulted in a Digitalis Advisor system which performs comparably to expert cardiologists in
the prescription and maintenance of digitalis therapy.

The Digitalis Advisor system operates as follows:

1. Interact with a physician to build a model of a specific patient. This interaction
involves asking about the heart condition that digitalis will regulate, about the
condition of the kidneys, etc.

2.  Prescribe  a  daily  dosage  level  on  the  basis  of  the  patient's  height  and  weight 
and  the  information  gathered  in  (1). 

3.  Follow-up  sessions  assess  the  patient's  response  to  the  dosage 
recommendations  in  (2)  and  modify  the  dosage  if  necessary. 


In 1977 this program was augmented with the ability to make explanations, but for the sake
of brevity (and because Article B is devoted to this issue) we will not discuss this capability
here. See Swartout (1977a) for a complete discussion of the explanation facility.


The  Patient-specific  Model 


What follows is part of an annotated transcript of a session with the Digitalis Advisor,
taken from Swartout (1977a).


What is the age of the patient?
Enter numerical value ** 66

What is the weight of the patient?
Units: Kg
Enter numerical value ** 72

What is the sex of the patient?
The possibilities are:
1. Male
2. Female
Enter single value ** MALE

What is the cardiac rhythm?
The possibilities are:
1. Normal sinus rhythm
2. Atrial fibrillation
3. Atrial flutter
4. PAT
5. PAT with block
6. Non-paroxysmal junctional tachycardia
Enter single value ** 2

So far the system is determining what it needs to know to make a patient-specific model.
Later, it will ask why the patient is being put on digitalis, whether the patient is
currently using the drug, etc.

Is there regularization of ventricular rhythm?
The possibilities are:
1. Present
2. Absent
Enter single value ** WHY?

The  user  of  the  system  wants  an  explanation  of  what  motivates  a  question.  The  system 
responds  by  converting  the  goals  on  its  goal  stack  to  English,  to  produce  the  first  two 
sentences  of  the  explanation.  The  system  then  finds  and  displays  a  canned  explanation 
which  is  associated  with  the  concept  that  is  being  questioned.  The  canned  explanation 
usually  provides  information  that  could  not  be  deduced  from  the  code  of  the  Advisor: 

My top level goal is to begin therapy. I am now trying to check
the cardiac rhythm. If the cardiac rhythm is atrial fibrillation
and there is ventricular-rhythm regularization this suggests that
the pacemaking impulses from the ventricles are no longer coming
from the atria. If the status of ventricular-rhythm regularization
is present it is possible there may be non-paroxysmal junctional
tachycardia which is a sign of digitalis toxicity.

The  system  continues  to  query  the  user  so  that  it  can  build  a  patient-specific  model. 
Eventually  it  arrives  at  the  following  point: 

How long do you wish to take to reach full digitalization?
The possibilities are:
1. Instantaneous rate (1-2 hours)
2. Rapid rate (1 day)
3. Moderate rate (2-3 days)
4. Slow rate (4-7 days)
Enter single value ** 3

Parameters used to make recommendations:

Body store goal:                0.544 mg
Projected amount on board:      0.000 mg
Body stores shift:              0.644 mg
Remaining time to reach goal:   48.0 hours
Half-life:                      43.7 hours (1.8 days)
Daily loss at equilibrium:      0.176 mg
Absorption factor:              0.76

The system then asks how many times a day the user wishes to administer digitalis and,
on  the  basis  of  this  information,  makes  dosage  recommendations. 

The next interaction with the Advisor comes in a follow-up session. The point of this
session is for the Advisor to determine whether the patient shows any therapeutic effect of
the drug or whether s/he displays any signs of toxicity, and to adjust the dosage
accordingly. There are three levels of therapeutic effect: None, Partial, and Complete.
Similarly, there are three levels of toxicity: None, Possible, and Definite. There are therefore
nine therapeutic/toxic states, each with an associated set of
recommendations. Digitalis toxicity is identified by five different types of signs and
symptoms, including non-cardiac signs (nausea, etc.) and direct cardiac signs of toxicity
(e.g., an increase of over 20% in the number of premature ventricular contractions). If any
cardiac manifestations are present, the patient is considered definitely toxic; the category
"Possibly Toxic" is indicated by various combinations of signs and symptoms from classes
other than the cardiac signs.
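
The state classification described above can be sketched in Python. This is only an illustration, not the Advisor's actual code: the names Therapeutic, Toxicity, classify_toxicity, and all_states are hypothetical, and because the text does not say which combinations of non-cardiac signs indicate possible toxicity, the sketch assumes that any non-cardiac sign is enough.

```python
from enum import Enum
from itertools import product


class Therapeutic(Enum):
    NONE = "None"
    PARTIAL = "Partial"
    COMPLETE = "Complete"


class Toxicity(Enum):
    NONE = "None"
    POSSIBLE = "Possible"
    DEFINITE = "Definite"


def classify_toxicity(cardiac_signs, noncardiac_signs):
    """Map observed signs to a toxicity level.

    Any cardiac manifestation means definite toxicity (from the text).
    ASSUMPTION: any non-cardiac sign on its own means possible toxicity;
    the real Advisor uses specific combinations that are not given here.
    """
    if cardiac_signs:
        return Toxicity.DEFINITE
    if noncardiac_signs:
        return Toxicity.POSSIBLE
    return Toxicity.NONE


def all_states():
    """Enumerate the nine therapeutic/toxic states."""
    return list(product(Therapeutic, Toxicity))
```

Each pair returned by all_states() would index one set of dosage recommendations; those recommendations are not given in this article, so the sketch stops at the classification.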

We will not consider the follow-up session in detail here. See Swartout (1977a) for a
complete transcript.


Concluding  Remarks 

The performance of the Digitalis Advisor reported by Gorry, Silverman, and Pauker
(1978) suggests that the Advisor can perform at least as well as physicians in the
prescription and monitoring of digitalis therapy. In particular, the Advisor was used to make
recommendations about therapy for a group of patients who were under the care of house
staff in a hospital, and it performed at least as well as the staff, who were themselves under
the direction of an attending physician.


References 

See Gorry, Silverman, and Pauker (1978) and Silverman (1976). The explanation
capability is described by Swartout (1977a).


C6. IRIS

The design goals for IRIS (Trigoboff & Kulikowski, 1977; Trigoboff, 1978) differ
from those of the other consultation systems, which were constructed to be expert clinical
decision-making systems in a particular medical domain. IRIS was designed to be a tool for
building and experimenting with such systems. Developed at Rutgers University and written
in INTERLISP, it was designed to permit easy experimentation with alternative
representations of general medical knowledge, clinical strategies, and modes of interaction.
It was designed to be used by a computer specialist in collaboration with a domain expert.
A consultation system for glaucoma has been developed using IRIS.

IRIS uses a combination of two well-established formalisms for
representing knowledge, semantic nets and production rules (see articles Representation.Ba and
Representation.B3). The semantic net consists of nodes representing patient information and
uses a large and extendable set of link types for associating this medical knowledge. A set of
production rules is associated with each link of the network. The transmission of information
between nodes of the semantic network is controlled by the production rules. This process is
called propagation and is the basis of any clinical strategy implemented in IRIS.


Representation  of  Medical  Knowledge 

As  with  the  other  medical  consultation  systems,  IRIS  makes  a  (very  sharp)  distinction 
between  general  medical  knowledge  and  any  patient-specific  knowledge.  The  general 
medical  knowledge  is  represented  partly  as  a  semantic  net  and  partly  as  production  rules. 
The  nodes  of  the  net  represent  clinical  concepts  such  as  pathophysiological  states, 
diseases,  symptoms,  findings,  treatments,  etc.  Examples  of  nodes  in  the  glaucoma 
application  are  OPEN  ANGLE  GLAUCOMA,  SCOTOMA,  PILOCARPINE  THERAPY.  The  links 
represent relations between the nodes, e.g., CAUSES, TREATMENT-FOR, SYMPTOM-OF, and
ASSOCIATED-WITH.

The patient-specific knowledge gathered during a consultation is represented as a set
of knowledge structures called "Information SPECifications" (ISPECs). ISPECs are
associated with nodes of the semantic net and are created, deleted, and modified during the
course of the consultation. An ISPEC is an assertion about the patient and is essentially a
frame (see article Representation.B7) with the following slots:

NODE - The name of the associated node in the semantic net. The node
represents the concept being asserted about this patient.

SIDE - This slot indicates the half of the body to which this ISPEC refers. Its
possible values are LEFT, RIGHT, or NIL. Some nodes in the net will be
applicable to a left organ and a right organ (e.g., eye) while others are not
(e.g., headache, diabetes). The use of SIDE provides an economical
representation, since many nodes might otherwise be duplicated in the net.

MB - This slot is a measure of belief that reflects the degree of system belief in
the assertion represented by the ISPEC. Any numerical method of
representing degree of certainty can be implemented here. In the
glaucoma application, the confidence factor mechanism of MYCIN (see
article C1) has been implemented. The MB is a pair of numbers: SB (strength
of belief) and SD (strength of disbelief). The actual MB is the difference
between these two numbers and ranges from total belief to total disbelief.

TIME - The time slot is a list of two dates, the date the ISPEC became true of the
patient and the date the ISPEC ceased to be true. The system can also
work with a "coarser" view of time: PAST, PAST-OR-PRESENT, and FUTURE.
This time representation is part of the mechanism for dealing with multiple
visits and for following a patient through a given course of therapy.

MODIFIERS - These are further specifications and qualifications of the basic
ISPEC. Examples of modifiers are VALUE, DEGREE, COLOR, and WIDTH. These
modifiers do not appear in all ISPECs, but only in those to which they are
applicable. They allow further patient-specific specification of
the concept in the semantic net. For example, "severely increased
intraocular pressure" is represented as an ISPEC for INCREASED
INTRAOCULAR PRESSURE with the modifier DEGREE: SEVERE.

TYPE - The type slot of an ISPEC determines the way in which it is interpreted.
An arbitrary number of types is possible. Currently implemented types are
NIL (the standard and default), FAMILYHISTORY, PATIENTHISTORY, and a
number of others that are used by the diagnosis strategy: CHOSEN,
COVERED-BY, SUBSUMED-BY, and TREATED-BY.

The statement "The pressure is 10 in the right eye" is equivalent to the ISPEC:

NODE = INTRAOCULAR PRESSURE
SIDE = RIGHT
MB   = (1,0)
TIME = PRESENT
MODS = VALUE: 10
TYPE = NIL
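
As a rough illustration of the frame structure just described, an ISPEC could be modeled in a modern language as a small record type. The sketch below uses Python rather than the INTERLISP of the original system; the class layout, the default values, and the belief method are assumptions made for illustration, with only the slot names and the "SB minus SD" rule taken from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ISPEC:
    node: str                                # name of the semantic-net node
    side: Optional[str] = None               # "LEFT", "RIGHT", or None (NIL)
    mb: Tuple[float, float] = (0.0, 0.0)     # (SB, SD)
    time: object = "PRESENT"                 # coarse time, or a pair of dates
    mods: Dict[str, object] = field(default_factory=dict)
    type: Optional[str] = None               # None stands for NIL

    def belief(self) -> float:
        """Net measure of belief: SB minus SD, as described for the MB slot."""
        sb, sd = self.mb
        return sb - sd


# "The pressure is 10 in the right eye"
pressure = ISPEC(node="INTRAOCULAR PRESSURE", side="RIGHT",
                 mb=(1.0, 0.0), mods={"VALUE": 10})
```

Keeping the modifiers in an open dictionary mirrors the remark that modifiers appear only in the ISPECs to which they apply.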


Reasoning 

IRIS makes no commitment to any particular strategy of question selection. Currently
a "questionnaire" strategy has been implemented. At the beginning of a consultation the
program runs through a set of questions and the user answers them.

In the applications of IRIS where consultation and diagnosis are the goal, ISPECs are
associated first with the set of symptoms displayed by the patient. In IRIS's knowledge
base, symptom nodes are linked to, among other things, disease nodes. Thus, a set of
disease nodes can be activated by the symptoms; a disease node is said to explain the
symptom nodes that characterize it. Disease nodes are also linked to treatment nodes, and
when IRIS has determined which disease(s) holds for a patient, it will activate the
appropriate (linked) treatment nodes.


The process of nodes evoking each other in IRIS is called propagation of ISPECs,
because an ISPEC is associated with a symptom, disease, or treatment node relevant to a
patient. When symptoms evoke a disease or when a disease evokes a treatment, an ISPEC is
created. This propagation of information and generation of inferences between any linked
nodes in the semantic net is controlled by a set of production rules associated with the link.
If the ISPECs associated with the node at the tail of the link satisfy the precondition pattern
of a rule, then the actions specified by the rule will be performed at the node at the head of
the link. Typical actions include the creation or deletion of ISPECs and the modification of
MBs. Thus, IRIS uses a forward-chaining reasoning process.
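
The propagation mechanism described above can be sketched as a small forward-chaining loop. Everything in this Python sketch is an assumption made for illustration: ISPECs are reduced to plain dictionaries, the functions propagate, present, and assert_at are hypothetical, the belief value 0.8 is arbitrary, and the two example links merely connect node names mentioned earlier in the article; they are not taken from the actual glaucoma knowledge base.

```python
from collections import deque

# Each link is (tail, head, rules); each rule is (precondition, action).
# A precondition inspects the ISPECs at the tail node; an action returns
# a new ISPEC (here just a dict) to assert at the head node.


def propagate(links, ispecs, start_nodes):
    """Forward-chain from start_nodes, firing link rules whose
    preconditions hold. `ispecs` maps node name -> list of ISPEC dicts."""
    agenda = deque(start_nodes)
    while agenda:
        tail = agenda.popleft()
        for link_tail, head, rules in links:
            if link_tail != tail:
                continue
            for precondition, action in rules:
                if precondition(ispecs.get(tail, [])):
                    new_ispec = action(head)
                    if new_ispec not in ispecs.setdefault(head, []):
                        ispecs[head].append(new_ispec)
                        agenda.append(head)  # the head may evoke further nodes
    return ispecs


def present(tail_ispecs):
    # Hypothetical precondition: some ISPEC at the tail has positive belief.
    return any(i["mb"] > 0 for i in tail_ispecs)


def assert_at(head):
    # Hypothetical action: create an ISPEC at the head node.
    return {"node": head, "mb": 0.8}


links = [
    ("SCOTOMA", "OPEN ANGLE GLAUCOMA", [(present, assert_at)]),
    ("OPEN ANGLE GLAUCOMA", "PILOCARPINE THERAPY", [(present, assert_at)]),
]
ispecs = {"SCOTOMA": [{"node": "SCOTOMA", "mb": 1.0}]}
propagate(links, ispecs, ["SCOTOMA"])
```

After the call, the symptom has evoked the disease node and the disease node has in turn evoked the treatment node, which is the tail-to-head firing order the text describes.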

An important propagation pattern is that of the "propagation cone." Consider the rule:

if SYMPTOM1 and SYMPTOM2 and SYMPTOM3 then DISEASE1


In  the  semantic  net,  the  nodes  in  this  rule  would  be  represented  as  follows: 


[Figure: DISEASE1 at the apex of a cone, joined by a CHARACTERIZES link to each of
SYM1, SYM2, and SYM3 at the base.]


Clearly,  an  ISPEC  should  only  propagate  to  DISEASE1  if  all  three  symptoms  are  present.  In 
the  case  depicted  above,  propagation  should  be  from  the  base  of  the  "cone"  to  the  "apex." 
This  propagation  pattern  is  achieved  by  associating  the  same  decision  table  with  all  three 
CHARACTERIZES links (essentially AND-ing SYM1, SYM2, and SYM3 into one production to
ensure that ALL symptoms are present before a disease node is evoked). In some cases the
direction  of  propagation  will  be  from  apex  to  base;  for  example,  when  propagating 
"COVERED-BY"  ISPECS  from  a  treatment  node  to  each  of  the  diseases  it  treats. 


The  production  rules  are  encoded  as  decision  tables  to  make  their  execution  more 
efficient.  Consider  the  following  set  of  production  rules: 


R1: if A and B then D

R2: if B and (not C) then (not E)

R3: if A and B and (not C) then F

In evaluating these rules, A and C are evaluated twice and B is evaluated three times.
A decision table encoding these three rules is:


          R1      R2      R3

A         +               +
B         +       +       +
C                 -       -

Action    D       not E   F


A column of the decision table corresponds to a rule. A condition is evaluated only once, and
the result is used in each applicable column.
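
The benefit of the decision-table encoding can be shown with a short Python sketch of rules R1 to R3. The data layout and the names TABLE, ACTIONS, fire, and evaluate are assumptions made for illustration; the article does not describe how IRIS stores its tables.

```python
# Rows are conditions, columns are rules. True means the condition must
# hold, False means it must not hold, None means the rule ignores it.
TABLE = {
    "A": {"R1": True, "R2": None, "R3": True},
    "B": {"R1": True, "R2": True, "R3": True},
    "C": {"R1": None, "R2": False, "R3": False},
}
ACTIONS = {"R1": "D", "R2": "not E", "R3": "F"}


def fire(table, actions, evaluate):
    """Return the actions of every rule whose column is satisfied.

    `evaluate` maps a condition name to a boolean and is called exactly
    once per condition, however many rules mention that condition.
    """
    live = set(actions)                     # rules not yet ruled out
    for condition, row in table.items():
        value = evaluate(condition)         # single evaluation per row
        for rule, required in row.items():
            if required is not None and required != value:
                live.discard(rule)
    return [actions[r] for r in actions if r in live]


calls = []


def evaluate(condition):
    # Hypothetical findings: A and B hold, C does not.
    calls.append(condition)
    return {"A": True, "B": True, "C": False}[condition]


fired = fire(TABLE, ACTIONS, evaluate)
```

With these findings all three rules fire, yet evaluate is called only three times, once each for A, B, and C, instead of the seven evaluations needed when the rules are tested one by one.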

The IRIS claim is that any clinical strategy can be implemented using the available
medical primitives. In fact, the propagation of weights in CASNET, therapy selection in
MYCIN, and the formation of composite hypotheses in INTERNIST-II were implemented with
very little effort (Trigoboff, 1978).


Clinical  strategy  of  IRIS  for  glaucoma  diagnosis 

The clinical strategy for the glaucoma application is implemented via a set of six special
nodes in the semantic net: CHOSEN-DIAGNOSIS, CHOSEN-TREATMENT, POSSIBLE-DIAGNOSIS,
POSSIBLE-TREATMENT, UNEXPLAINED-SYMPTOM, and UNTREATED-PATHOLOGY. The goal of the
consultation is (a) to have one or more ISPECs associated with the nodes
CHOSEN-DIAGNOSIS and CHOSEN-TREATMENT, and (b) to have all ISPECs associated with
UNEXPLAINED-SYMPTOM and UNTREATED-PATHOLOGY be TYPE=COVERED-BY. As findings are
entered, they propagate ISPECs to the node UNEXPLAINED-SYMPTOM. Propagation across
SYMPTOM-OF links will result in ISPECs with varying CFs (confidence factors) associated
with a number of disease nodes. Any disease with a high enough CF will propagate an ISPEC
to the node POSSIBLE-DIAGNOSIS. After all data have been entered, the diseases associated
with POSSIBLE-DIAGNOSIS are investigated in turn. Each diagnosis temporarily receives
TYPE=CHOSEN, and TYPE=COVERED-BY propagates to each symptom explained by this
disease. The number of explained symptoms is used as a measure of the explanatory power
of a disease. This process of temporary assignment is repeated for each possible
diagnosis, and the disease that explains the most symptoms is given a permanent
TYPE=CHOSEN. If any symptoms remain unexplained, the process is repeated.
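
Read as an algorithm, the diagnosis strategy above resembles a greedy covering procedure. The Python sketch below is one possible reading, not the IRIS implementation: the function choose_diagnoses and its arguments are hypothetical, confidence factors are ignored once a disease has reached POSSIBLE-DIAGNOSIS, and ties are broken by the order in which candidates are listed, which the text does not specify.

```python
def choose_diagnoses(symptoms, explains):
    """Greedy selection sketch.

    symptoms: set of symptom names entered for the patient.
    explains: dict mapping each possible diagnosis to the set of
              symptoms it explains (its COVERED-BY targets).
    Returns the chosen diagnoses and any symptoms left unexplained.
    """
    unexplained = set(symptoms)
    candidates = dict(explains)
    chosen = []
    while unexplained and candidates:
        # Temporarily "choose" each candidate and count what it covers.
        best = max(candidates, key=lambda d: len(candidates[d] & unexplained))
        covered = candidates.pop(best) & unexplained
        if not covered:
            break                      # no remaining candidate explains anything
        chosen.append(best)            # permanent TYPE=CHOSEN
        unexplained -= covered         # these symptoms become COVERED-BY
    return chosen, unexplained


chosen, left = choose_diagnoses(
    {"S1", "S2", "S3"},
    {"D1": {"S1", "S2"}, "D2": {"S2"}, "D3": {"S3"}},
)
```

In this made-up case D1 is chosen first because it explains two symptoms, and the repeat pass then picks D3 for the remaining symptom, leaving nothing unexplained.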

A similar strategy using the nodes POSSIBLE-TREATMENT, CHOSEN-TREATMENT, and
UNTREATED-PATHOLOGY is used to select treatments.


Concluding Remarks

IRIS has been explained in the context of its glaucoma application, but it was designed
to represent medical knowledge from ANY domain and to implement a variety of clinical
strategies. (Recall that aspects of CASNET, MYCIN, and INTERNIST-II have all been
implemented in IRIS.)

This generality is feasible because the representation of knowledge is itself very
general (augmented semantic nets). In principle, knowledge from any (medical or nonmedical)
domain can be represented. A second characteristic of IRIS that makes it very general is
the separation of clinical strategy, both conceptually and operationally, from medical
knowledge. Note that to implement the "consultation" strategy, IRIS needed to "know about"
only six nodes in the knowledge base: chosen diagnosis, chosen treatment, possible
diagnosis, possible treatment, unexplained symptom, and untreated pathology. These six
concepts are inherent to the clinical strategy of consultation; every other node in the
knowledge base is conceptually and operationally independent of the implementation of the
clinical strategy.


References 


See Trigoboff & Kulikowski (1977) and Trigoboff (1978).


References 


AIM Workshop Proceedings. Dept. of Computer Science, Rutgers University. Held annually,
1976-78.

Clancey,  W.  Tutoring  rules  for  guiding  a  case  method  dialogue.  International  Journal  of 
Man-Machine  Studies,  1979,  11,  26-49. 

Croft,  J.  Is  Computerized  Diagnosis  Possible?  Computers  and  Biomedical  Research,  1972, 
6(4),  361-367. 

Davis, R. Applications of Meta-Level Knowledge to the Construction, Maintenance and
Use of Large Knowledge Bases, Stanford AI Lab Memo AIM-283, AI Lab, Stanford
University, 1976.

Davis, R. Interactive transfer of expertise: Acquisition of new inference rules. IJCAI 5,
1977, 321-328.

Davis, R. Knowledge acquisition in rule-based systems: Knowledge about
representations as a basis for system construction and maintenance. In D. Waterman &
F. Hayes-Roth (Eds.), Pattern-Directed Inference Systems. New York: Academic
Press, 1978. Pp. 99-134.

Davis, R., & Buchanan, B. Meta-level knowledge: Overview and Applications. IJCAI 5, 1977,
920-928.

Davis, R., Buchanan, B., & Shortliffe, E. H. Production Rules as a Representation for a
Knowledge-Based Consultation Program. Journal of Artificial Intelligence, 1977, 8(1),
16-46.


Feigenbaum, E. A. The art of artificial intelligence: Themes and case studies in knowledge
engineering. IJCAI 5, 1977, 1014-1029.

Feinstein,  A.  Clinical  Judgment.  Baltimore:  William  &  Wilkins,  1967. 

Gorry, A., & Barnett, O. Sequential diagnosis by computer. Journal of the American
Medical Association, 1968, 206, 849-864.

Gorry, G. A., Silverman, H., & Pauker, S. G. Capturing clinical expertise: A computer program
that considers clinical response to digitalis. American Journal of Medicine, 1978, 64,
452-460.

Heiser, J. A computerized Psychopharmacology Advisor. HEAD-MED Report in the SUMEX
Annual Report. Computer Science Dept., Stanford University, 1977-1978.

Jacquez, J. A. The Diagnostic Process. Ann Arbor, Mich.: Mallory Lithography, 1964.

Ledley,  R.,  &  Lusted,  L.  Reasoning  foundations  of  medical  diagnosis.  Science,  1959, 
130(3366),  9-21. 



Nordyke, R., Kulikowski, C. A., & Kulikowski, C. W. A Comparison of Methods for the
Automated Diagnosis of Thyroid Dysfunction. Computers and Biomedical Research,
1971, 4(4), 374-389.

Pauker, S., Gorry, A., Kassirer, J., & Schwartz, W. Towards the Simulation of Clinical
Cognition: Taking a Present Illness by Computer. American Journal of Medicine, June
1976, 60, 981-996.

Pople, H. E. The DIALOG Model of Diagnostic Logic and Its Use in Internal Medicine. IJCAI 4,
Tbilisi, USSR, 1976.

Pople, H. E. The formation of composite hypotheses in diagnostic problem solving: An
exercise in synthetic reasoning. IJCAI 5, 1977, 1030-1037.

Safrans, C., Desforges, J., & Tsichlis, P. Diagnostic Planning and Cancer Management,
MIT/LCS/TR-169, MIT, 1976.

Shortliffe,  E.  H.  Computer-Based  Medical  Consultations:  MYCIN.  New  York:  Elsevier, 
1976. 

Silverman, H. A Digitalis Therapy Advisor, MAC TR-143, Computer Science Dept., MIT,
1976.

Sridharan, N. S. Journal of AI: Special issue on applications in the sciences and medicine,
1978, 11(1,2), 1-196.

Swartout,  W.  A  Digitalis  Therapy  Advisor  with  Explanations,  MAC  TR-176,  Computer 
Science  Dept.,  MIT,  1977.  (a) 

Swartout,  W.  A  Digitalis  Therapy  Advisor  with  Explanations.  IJCAI  5,  1977,  819-825.  (b) 

Szolovits, P., & Pauker, S. Research on a Medical Consultation Program for Taking the
Present Illness. Proc. 3rd Illinois Conf. on Medical Information Systems, November
1976.

Szolovits,  P.,  &  Pauker,  S.  Categorical  and  Probabilistic  Reasoning  in  Medical  Diagnosis. 
Journal  of  Artificial  Intelligence,  1978,  10.  In  press. 

Trigoboff, M. IRIS: A Framework for the Construction of Clinical Consultation
Systems. Doctoral dissertation, Dept. of Computer Science, Rutgers University, 1978.

Trigoboff, M., & Kulikowski, C. IRIS: A System for the Propagation of Inferences in a
Semantic Net. IJCAI 5, 1977, 274-280.

Weiss, S., Kulikowski, C., & Safir, A. A Model-Based Consultation System for the Long-Term
Management of Glaucoma. IJCAI 5, 1977, 826-832.

Weiss, S., Kulikowski, C., & Safir, A. A Model-Based Method for Computer-Aided Medical
Decision-Making. Journal of AI, 1978, 11(1,2), 146-172.


